meta-llama/llama-3.2-1b-instruct
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| Cloudflare | $0.027 | $0.201 | 60K | — |
Llama 3.2 1B Instruct costs $0.027 per million input tokens and $0.201 per million output tokens via OpenRouter, making it the 5th cheapest of 298 paid models.
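To make the per-million pricing concrete, the following is a minimal sketch of a per-request cost estimate. The two rates are the ones listed above; the function name `estimate_cost_usd` and the example token counts are hypothetical and chosen only for illustration. Actual billing may differ (for example, through provider-specific token counting).

```python
# Listed rates in USD per million tokens (from the pricing line above).
INPUT_USD_PER_M = 0.027
OUTPUT_USD_PER_M = 0.201


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper: it multiplies each token count by its rate
    and divides by one million, nothing more.
    """
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical workload: 2,000 prompt tokens and 500 completion tokens.
cost = estimate_cost_usd(2_000, 500)
print(f"Estimated cost: ${cost:.7f}")
```

Because output tokens cost roughly 7.4 times as much as input tokens at these rates, the completion length tends to dominate the bill for short prompts.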
Llama 3.2 1B Instruct scores 6.3 on the Artificial Analysis Intelligence Index, ranking 178th of 178 benchmarked models, with a GPQA Diamond score of 20%.
Llama 3.2 1B Instruct generates around 169 tokens per second with a median (p50) time-to-first-token of 332 ms, making it the 24th fastest tracked model.
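The throughput and time-to-first-token figures can be combined into a rough end-to-end latency estimate. The sketch below assumes a simple model that is not stated in the listing: the response time is the time-to-first-token plus the output length divided by a constant generation rate. The function name `estimate_latency_seconds` is hypothetical, and real latency will vary with load, prompt length, and provider.

```python
# Figures from the speed line above (p50 TTFT and approximate throughput).
TTFT_SECONDS = 0.332
TOKENS_PER_SECOND = 169.0


def estimate_latency_seconds(output_tokens: int) -> float:
    """Rough response time under the assumed model:
    time-to-first-token + output_tokens / steady throughput.
    """
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Hypothetical completion lengths, for illustration only.
for n in (100, 500, 2_000):
    print(f"{n:>5} tokens: about {estimate_latency_seconds(n):.2f} s")
```

Under this assumption, the fixed 332 ms startup matters mostly for very short replies; for longer completions the generation rate dominates.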
Llama 3.2 1B Instruct supports a 131K-token context window and can output up to 60K tokens. It accepts text input.