modelgrep

Meta: Llama 3.2 1B Instruct

meta-llama/llama-3.2-1b-instruct

178th smartest of 178 · Cheaper than 98% of paid
Intelligence: 6.3 (178th of 178)
Design Elo: (no value shown)
Speed: 169 tokens per second (24th fastest)
Latency: 332ms to first token
Input price: $0.027 per million tokens (5th cheapest)
Context: 131K (60K max out)

How it compares

Smarter than 0% of all ranked models
Faster than 92% of all ranked models
Cheaper than 98% of all ranked models

Overview

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 1st percentile
Intelligence Index: 6.3
Coding Index: 0.6
Agentic Index: 0.0
GPQA Diamond: 20%
Humanity's Last Exam: 5%
SciCode: 2%
Tau²-Bench (agentic): 0%

Providers & pricing (1)

Provider      In $/M    Out $/M    Uptime
Cloudflare    $0.027    $0.201     (no value shown)

Specifications

Context window: 131K
Max output: 60K
Knowledge cutoff: Dec 2023
Input modalities: text
Output modalities: text
Prompt caching: (no value shown)
Cache read price: (no value shown)
Moderated: No

Llama 3.2 1B Instruct FAQ

How much does Llama 3.2 1B Instruct cost?

Llama 3.2 1B Instruct costs $0.027 per million input tokens and $0.201 per million output tokens via OpenRouter, making it the 5th cheapest of 298 paid models.
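
To make the per-million pricing concrete, here is a minimal sketch of the arithmetic for a single request. The two prices come from the listing above; the function name and the example token counts are illustrative assumptions, and actual billing may differ (for example through rounding or provider changes).

```python
# Listed prices in USD per one million tokens (from the pricing above).
INPUT_PRICE_PER_M = 0.027
OUTPUT_PRICE_PER_M = 0.201


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 1,000-token reply.
    print(f"${estimate_cost_usd(10_000, 1_000):.6f}")
```

At these rates, output tokens cost roughly seven times as much as input tokens, so long replies dominate the bill even for a model this cheap.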

How smart is Llama 3.2 1B Instruct?

Llama 3.2 1B Instruct scores 6.3 on the Artificial Analysis Intelligence Index, ranking 178th of 178 benchmarked models, with a GPQA Diamond score of 20%.

How fast is Llama 3.2 1B Instruct?

Llama 3.2 1B Instruct generates around 169 tokens per second with 332ms time-to-first-token (p50), the 24th fastest tracked model.
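
As a rough illustration of what those two figures mean for a full response, the sketch below adds time-to-first-token to the time spent generating tokens. It assumes throughput stays near the listed 169 tokens per second for the whole reply and ignores network and queueing delays; the function name is hypothetical.

```python
# Listed median figures (from the speed stats above).
TTFT_SECONDS = 0.332        # time to first token
TOKENS_PER_SECOND = 169.0   # generation throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time for a reply of the given length."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Example: a 500-token reply.
    print(f"{estimate_response_seconds(500):.2f} s")
```

For short replies the first-token latency is a noticeable share of the total; for long replies the throughput term dominates.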

What is Llama 3.2 1B Instruct's context window?

Llama 3.2 1B Instruct supports a 131K-token context window and can output up to 60K tokens. It accepts text input.
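
A minimal sketch of how the two limits might interact when planning a request. It assumes the prompt and the reply share the same context window, and it treats the rounded figures (131K, 60K) as exact token counts, which they may not be. The constant and function names are illustrative.

```python
# Rounded limits from the listing; the exact token counts are an assumption.
CONTEXT_WINDOW = 131_000
MAX_OUTPUT = 60_000


def max_output_budget(prompt_tokens: int) -> int:
    """Largest reply length that still fits, given the prompt size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The reply is capped by both the output limit and the remaining window.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 100_000, 200_000):
        print(prompt, max_output_budget(prompt))
```

Under these assumptions the 60K output cap is the binding limit for prompts up to about 71K tokens; beyond that, the remaining context window is.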
