modelgrep

Meta: Llama 3.2 3B Instruct

meta-llama/llama-3.2-3b-instruct

Cheaper than 93% of paid
Intelligence
Design Elo
Speed: 102 tokens/s (63rd fastest)
Latency: 223ms to first token
Input price: $0.051 per million tokens (22nd cheapest)
Context: 131K (80K max out)

How it compares

Faster than 79% of all ranked models
Cheaper than 93% of all ranked models

Overview

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...

Benchmarks

independent · via OpenRouter
Artificial Analysis
GPQA Diamond: 26%
Humanity's Last Exam: 5%
SciCode: 5%
Tau²-Bench (agentic): 21%

Providers & pricing (1)

Provider    In $/M   Out $/M  Uptime
Cloudflare  $0.051   $0.335   100%

Specifications

Context window: 131K
Max output: 80K
Knowledge cutoff: Dec 2023
Input modalities: text
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Llama 3.2 3B Instruct FAQ

How much does Llama 3.2 3B Instruct cost?

Llama 3.2 3B Instruct costs $0.051 per million input tokens and $0.335 per million output tokens via OpenRouter, making it the 22nd cheapest of 298 paid models.
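
As a rough illustration of what those per-million prices mean for a single request, the arithmetic can be sketched as below. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is purely linear in token count, with no minimum charge, caching discount, or other fees.

```python
# Prices in USD per million tokens, taken from the listing above.
INPUT_PRICE_PER_M = 0.051
OUTPUT_PRICE_PER_M = 0.335


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts and nothing else.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.6f}")
```

Under these assumptions, a 10,000-token prompt with a 2,000-token reply comes to about $0.00118, and the output side dominates even though it is five times shorter.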

How fast is Llama 3.2 3B Instruct?

Llama 3.2 3B Instruct generates around 102 tokens per second with a p50 time-to-first-token of 223ms, making it the 63rd fastest tracked model.
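
To turn those two figures into an approximate wall-clock time for a full response, one simple model is time-to-first-token plus output length divided by throughput. The sketch below uses a hypothetical helper, `estimate_response_seconds`, and assumes throughput stays constant after the first token. It ignores network overhead, queueing, and variation between requests, so treat the result as a ballpark only.

```python
# Figures from the listing above.
TTFT_SECONDS = 0.223        # p50 time-to-first-token
TOKENS_PER_SECOND = 102.0   # approximate generation speed


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough time to receive a full response (hypothetical helper).

    Assumes steady throughput once the first token has arrived.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Estimated seconds for a 500-token reply.
    print(f"{estimate_response_seconds(500):.2f} s")
```

For a 500-token reply this model gives a little over five seconds, almost all of it generation time; the first-token latency only matters much for very short outputs.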

What is Llama 3.2 3B Instruct's context window?

Llama 3.2 3B Instruct supports a 131K-token context window and can output up to 80K tokens. It accepts text input.
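
When choosing an output limit for a request, it can help to check how much room a prompt leaves. The sketch below rests on several assumptions not stated in the listing: that "131K" and "80K" can be approximated as 131,000 and 80,000 tokens (the exact limits may differ), and that prompt and output share the same context window. The helper name `max_output_budget` is hypothetical.

```python
# Rounded figures from the listing; the exact limits are an assumption.
CONTEXT_WINDOW_TOKENS = 131_000
MAX_OUTPUT_TOKENS = 80_000


def max_output_budget(prompt_tokens: int) -> int:
    """Largest output length that still fits (hypothetical helper).

    Assumes prompt and output tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


if __name__ == "__main__":
    for prompt in (10_000, 100_000, 200_000):
        print(prompt, max_output_budget(prompt))
```

With these assumed limits, short prompts are capped by the output limit, long prompts are capped by the remaining context, and a prompt larger than the window leaves no budget at all.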
