modelgrep
M

Meta: Llama 4 Maverick

meta-llama/llama-4-maverick

130th smartest of 178ToolsJSONVision
Use via OpenRouter ↗
Intelligence
18.4
130th of 178
Design Elo
980
3D
Speed
72
107th fastest
Latency
303ms
first token
Input price
$0.150
84th cheapest
Context
1.0M
16K max out

How it compares

Smarter than27%
of all ranked models
Faster than64%
of all ranked models
Cheaper than72%
of all ranked models

Overview

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Benchmarks

independent · via OpenRouter
Artificial Analysis27th percentile
Intelligence Index
18.4
Coding Index
15.6
Agentic Index
7.2
GPQA Diamond
67%
Humanity's Last Exam
5%
SciCode
33%
Tau²-Bench (agentic)
18%
Design Arena · Elo790 tournaments
3D
980
UI Component
956
codecategories
932
Data Viz
926
Website
916
Game Dev
907

Providers & pricing (4)

ProviderIn $/MOut $/MUptime
DeepInfrafp8$0.150$0.600100%
Novitafp8$0.270$0.850100%
Parasailfp8$0.350$1.00100%
Google$0.350$1.1599.9%

Specifications

Context window1.0M
Max output16K
Knowledge cutoffAug 2024
Input modalitiestext, image
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo

Llama 4 Maverick FAQ

How much does Llama 4 Maverick cost?

Llama 4 Maverick costs $0.150 per million input tokens and $0.600 per million output tokens via OpenRouter, making it 84th cheapest of 298 paid models.

How smart is Llama 4 Maverick?

Llama 4 Maverick scores 18.4 on the Artificial Analysis Intelligence Index, ranking 130th of 178 benchmarked models, with a GPQA Diamond score of 67%.

How fast is Llama 4 Maverick?

Llama 4 Maverick generates around 72 tokens per second with 303ms time-to-first-token (p50), the 107th fastest tracked model.

What is Llama 4 Maverick's context window?

Llama 4 Maverick supports a 1.0M-token context window and can output up to 16K tokens. It accepts text, image input.

Compare head-to-head