modelgrep

Meta: Llama 4 Scout

meta-llama/llama-4-scout

151st smartest of 178 · Cheaper than 80% of paid · Tools · JSON · Vision
Intelligence: 13.5 (151st of 178)
Design Elo: 940 (Data Viz)
Speed: 130 tokens/s (39th fastest)
Latency: 249ms to first token
Input price: $0.100 per million tokens (60th cheapest)
Context: 10M (16K max out)

How it compares

Smarter than 15% of all ranked models
Faster than 87% of all ranked models
Cheaper than 80% of all ranked models

Overview

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 16th percentile
Intelligence Index: 13.5
Coding Index: 6.7
Agentic Index: 5.2
GPQA Diamond: 59%
Humanity's Last Exam: 4%
SciCode: 17%
Tau²-Bench (agentic): 16%
Design Arena · Elo · 615 tournaments
Data Viz: 940
code categories: 842
Game Dev: 842
UI Component: 825
Website: 795

Providers & pricing (4)

Provider          In $/M   Out $/M  Uptime
DeepInfra (fp8)   $0.100   $0.300   100%
Groq              $0.110   $0.340   99.8%
Novita (bf16)     $0.180   $0.590   100%
Google            $0.250   $0.700   100%
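
To compare providers for a particular workload, the per-million prices listed above can be turned into a cost per request. The following is a minimal sketch, not an official calculator: the names `PROVIDERS`, `request_cost`, and `cheapest_provider` are hypothetical, the prices are copied from the listing as shown here and may change, and the token counts in the usage lines are made up for illustration.

```python
# Prices copied from the provider listing above: (input, output) in USD per million tokens.
PROVIDERS = {
    "DeepInfra": (0.100, 0.300),
    "Groq": (0.110, 0.340),
    "Novita": (0.180, 0.590),
    "Google": (0.250, 0.700),
}


def request_cost(provider, input_tokens, output_tokens):
    """Return the estimated USD cost of one request at the listed prices."""
    in_price, out_price = PROVIDERS[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest_provider(input_tokens, output_tokens):
    """Return the provider name with the lowest estimated cost for this workload."""
    return min(PROVIDERS, key=lambda name: request_cost(name, input_tokens, output_tokens))


if __name__ == "__main__":
    # Illustrative workload: 50,000 prompt tokens and 2,000 generated tokens.
    for name in PROVIDERS:
        print(f"{name}: ${request_cost(name, 50_000, 2_000):.4f}")
    print("cheapest:", cheapest_provider(50_000, 2_000))
```

The sketch considers price only; uptime and quantization (fp8 vs. bf16) are separate factors when picking a provider.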

Specifications

Context window: 10M
Max output: 16K
Knowledge cutoff: Aug 2024
Input modalities: text, image
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Llama 4 Scout FAQ

How much does Llama 4 Scout cost?

Llama 4 Scout costs $0.100 per million input tokens and $0.300 per million output tokens via OpenRouter, making it the 60th cheapest of 298 paid models.

How smart is Llama 4 Scout?

Llama 4 Scout scores 13.5 on the Artificial Analysis Intelligence Index, ranking 151st of 178 benchmarked models, with a GPQA Diamond score of 59%.

How fast is Llama 4 Scout?

Llama 4 Scout generates around 130 tokens per second with a 249ms time-to-first-token (p50), making it the 39th fastest tracked model.
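
The two speed figures combine into a rough end-to-end estimate: time to first token plus output length divided by throughput. This is a back-of-the-envelope sketch that assumes the p50 figures quoted above hold steady for the whole response; real timings vary by provider and load, and the function name `estimated_response_seconds` is hypothetical.

```python
# Figures quoted above (p50); treated as constants for a rough estimate only.
TTFT_SECONDS = 0.249        # time to first token
TOKENS_PER_SECOND = 130.0   # generation throughput


def estimated_response_seconds(output_tokens):
    """Estimate wall-clock seconds to stream `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Illustrative response lengths, in tokens.
    for n in (100, 1_000, 16_000):
        print(f"{n} tokens: about {estimated_response_seconds(n):.1f}s")
```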

What is Llama 4 Scout's context window?

Llama 4 Scout supports a 10M-token context window and can output up to 16K tokens. It accepts text and image input.
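
A request can be checked against these limits before it is sent. The sketch below rests on several assumptions: "10M" and "16K" are read as the round numbers 10,000,000 and 16,000 (the exact limits may be slightly higher), the prompt and the output are assumed to share the context window, and token counts are assumed to come from whatever tokenizer the caller already uses. The function name `fits_limits` is hypothetical.

```python
# Conservative round-number readings of the listed limits (assumptions, see above).
CONTEXT_WINDOW_TOKENS = 10_000_000
MAX_OUTPUT_TOKENS = 16_000


def fits_limits(prompt_tokens, requested_output_tokens):
    """Return True if the request stays within both listed limits."""
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    # Assumes prompt and output tokens count against the same window.
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_limits(200_000, 4_000))     # long prompt, modest output
    print(fits_limits(200_000, 32_000))    # output request exceeds the cap
```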
