meta-llama/llama-3.2-3b-instruct
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...
| Provider | Input ($/M tokens) | Output ($/M tokens) | Context | Uptime |
|---|---|---|---|---|
| Cloudflare | $0.051 | $0.335 | 80K | 100% |
Llama 3.2 3B Instruct costs $0.051 per million input tokens and $0.335 per million output tokens via OpenRouter, making it the 22nd cheapest of 298 paid models.
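As a rough illustration of what these per-million-token prices mean for a single request, the sketch below turns token counts into an estimated dollar cost. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the calculation assumes billing is strictly linear in tokens with no minimums, caching discounts, or extra fees.

```python
from decimal import Decimal

# Prices taken from the listing above, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.051")
OUTPUT_PRICE_PER_M = Decimal("0.335")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    print(f"Estimated cost: ${estimate_cost_usd(2_000, 500)}")
```

`Decimal` is used instead of `float` so that very small per-request amounts do not pick up binary rounding noise when summed over many requests.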
Llama 3.2 3B Instruct generates around 102 tokens per second with a 223 ms time-to-first-token (p50), making it the 63rd fastest tracked model.
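These two figures can be combined into a back-of-the-envelope estimate of how long a full response takes. The sketch below assumes the throughput is steady after the first token, that the 102 tokens/s figure excludes the time-to-first-token, and that median values are representative; the function name `estimate_response_seconds` is hypothetical, and real latency will vary with load and prompt length.

```python
# Median figures from the listing above.
TTFT_SECONDS = 0.223        # time-to-first-token (p50)
TOKENS_PER_SECOND = 102.0   # approximate generation throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total response time as TTFT plus steady-state generation time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical response lengths, in tokens.
    for n in (100, 500, 2_000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.2f} s")
```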
Llama 3.2 3B Instruct supports a 131K-token context window and can output up to 80K tokens. It accepts text input.