meta-llama/llama-4-scout
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.100 | $0.300 | 328K | 100% |
| Groq | $0.110 | $0.340 | 131K | 99.8% |
| Novita (bf16) | $0.180 | $0.590 | 131K | 100% |
|  | $0.250 | $0.700 | 1.3M | 100% |
Llama 4 Scout costs $0.100 per million input tokens and $0.300 per million output tokens via OpenRouter, making it the 60th cheapest of 298 paid models.
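To make the per-million pricing concrete, here is a minimal sketch that estimates the cost of a single request at the prices quoted above. The function name `estimate_cost` and the example token counts are hypothetical, chosen only for illustration. Actual billing may differ by provider and by how tokens are counted.

```python
from decimal import Decimal

# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.100")
OUTPUT_PRICE_PER_M = Decimal("0.300")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed prices."""
    million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / million


# Hypothetical request: 200,000 input tokens and 2,000 output tokens.
print(f"Estimated cost: ${estimate_cost(200_000, 2_000)}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.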
Llama 4 Scout scores 13.5 on the Artificial Analysis Intelligence Index, ranking 151st of 178 benchmarked models, with a GPQA Diamond score of 59%.
Llama 4 Scout generates around 130 tokens per second with 249ms time-to-first-token (p50), the 39th fastest tracked model.
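The two figures above can be combined into a rough response-time estimate: time-to-first-token plus output length divided by throughput. The sketch below assumes throughput stays constant for the whole generation and treats the p50 latency as typical, so it is a back-of-the-envelope model rather than a guarantee. The function name `estimate_latency` is hypothetical.

```python
TTFT_SECONDS = 0.249       # 249 ms p50 time-to-first-token, as quoted above
TOKENS_PER_SECOND = 130.0  # approximate generation speed, as quoted above


def estimate_latency(output_tokens: int) -> float:
    """Estimate seconds until the full response arrives.

    Assumes constant throughput after the first token, which is a
    simplification: real speed varies with provider and load.
    """
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Hypothetical response length of 1,300 tokens.
print(f"Estimated latency: {estimate_latency(1300):.3f} s")
```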
Llama 4 Scout supports a 10M-token context window and can output up to 16K tokens. It accepts text and image input.