meta-llama/llama-4-maverick
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.150 | $0.600 | 1.0M | 100% |
| Novita (fp8) | $0.270 | $0.850 | 1.0M | 100% |
| Parasail (fp8) | $0.350 | $1.00 | 524K | 100% |
| | $0.350 | $1.15 | 524K | 99.9% |
Llama 4 Maverick costs as little as $0.150 per million input tokens and $0.600 per million output tokens via OpenRouter (the DeepInfra rate), making it the 84th cheapest of 298 paid models.
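To make the per-million pricing concrete, here is a minimal sketch that estimates the cost of a single request. The prices are copied from the table above; the `PRICES` dictionary and the `request_cost` helper are hypothetical names used only for illustration, and actual billing may differ (for example through rounding or routing to a different provider).

```python
# Per-million-token prices in USD, copied from the provider table.
# Keys are the provider names; each value is (input price, output price).
PRICES = {
    "DeepInfra": (0.150, 0.600),
    "Novita": (0.270, 0.850),
    "Parasail": (0.350, 1.00),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    in_price, out_price = PRICES[provider]
    # Prices are quoted per one million tokens, so scale down accordingly.
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 1,000-token reply on each listed provider.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 10_000, 1_000):.4f}")
```

At these rates, a 10,000-token prompt with a 1,000-token reply on DeepInfra comes to roughly $0.0021, so output length matters more per token than prompt length.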
Llama 4 Maverick scores 18.4 on the Artificial Analysis Intelligence Index, ranking 130th of 178 benchmarked models, with a GPQA Diamond score of 67%.
Llama 4 Maverick generates around 72 tokens per second with 303ms time-to-first-token (p50), the 107th fastest tracked model.
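The throughput and time-to-first-token figures can be combined into a rough end-to-end latency estimate. The sketch below assumes generation proceeds at a steady rate after the first token, which is a simplification: both numbers are medians, and real latency varies with provider, load, and prompt length. The function name `estimated_latency_seconds` is hypothetical.

```python
def estimated_latency_seconds(
    output_tokens: int,
    ttft_ms: float = 303.0,          # p50 time-to-first-token quoted above
    tokens_per_second: float = 72.0, # approximate throughput quoted above
) -> float:
    """Rough wall-clock estimate: time to first token plus generation time.

    Assumes a constant generation rate, which is only an approximation.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 720, 4_000):
        print(f"{n} output tokens: about {estimated_latency_seconds(n):.1f} s")
```

Under these assumptions, a 720-token reply takes about 10.3 seconds, almost all of it generation time rather than the initial 303 ms wait.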
Llama 4 Maverick supports a 1.0M-token context window and can output up to 16K tokens. It accepts text and image input.
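A quick pre-flight check against these limits might look like the sketch below. It rests on several assumptions that the listing does not confirm: that "1.0M" means 1,000,000 tokens and "16K" means 16,000 tokens, and that the prompt and the requested output share the same context window. Note also that some providers in the table advertise a smaller 524K context. The constants and the `fits_limits` helper are hypothetical.

```python
# Assumed exact values for the rounded figures in the listing.
CONTEXT_WINDOW = 1_000_000  # "1.0M" taken as 1,000,000 tokens (assumption)
MAX_OUTPUT = 16_000         # "16K" taken as 16,000 tokens (assumption)


def fits_limits(
    prompt_tokens: int,
    requested_output_tokens: int,
    context_window: int = CONTEXT_WINDOW,
    max_output: int = MAX_OUTPUT,
) -> bool:
    """Return True if the request stays within both limits.

    Assumes prompt and output tokens count against the same window.
    """
    if requested_output_tokens > max_output:
        return False
    return prompt_tokens + requested_output_tokens <= context_window


if __name__ == "__main__":
    print(fits_limits(900_000, 16_000))   # within both limits
    print(fits_limits(990_000, 16_000))   # prompt plus output exceeds the window
    print(fits_limits(1_000, 32_000))     # output request exceeds the cap
```

Passing a smaller `context_window` lets the same check be reused for a provider with a reduced limit.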