modelgrep

Qwen: Qwen3 8B

qwen/qwen3-8b

164th smartest of 178 · Cheaper than 94% of paid · Reasoning · Tools · JSON
Intelligence: 10.6 (164th of 178)
Design Elo
Speed: 29 tokens per second (240th fastest)
Latency: 672ms to first token
Input price: $0.050 per million tokens (18th cheapest)
Context: 131K (8K max output)

How it compares

Smarter than 8% of all ranked models
Faster than 19% of all ranked models
Cheaper than 94% of all ranked models

Overview

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 9th percentile
Intelligence Index: 10.6
Coding Index: 7.1
Agentic Index: 11.6
GPQA Diamond: 45%
Humanity's Last Exam: 3%
SciCode: 17%
Tau²-Bench (agentic): 25%

Providers & pricing (2)

Provider            In $/M    Out $/M   Uptime
AtlasCloud (fp8)    $0.050    $0.400    99.8%
Alibaba             $0.117    $0.455    99.9%

Specifications

Context window: 131K
Max output: 8K
Knowledge cutoff: Mar 2025
Input modalities: text
Output modalities: text
Prompt caching
Cache read price: $0.050/M
Moderated: No
Open weights: Qwen/Qwen3-8B

Qwen3 8B FAQ

How much does Qwen3 8B cost?

Qwen3 8B costs $0.050 per million input tokens and $0.400 per million output tokens via OpenRouter, making it the 18th cheapest of 298 paid models.
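To make the per-million pricing concrete, here is a minimal sketch that turns the two quoted prices into a per-request estimate. The function name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes billing is linear in tokens with no caching discount or provider markup.

```python
from decimal import Decimal

# Prices quoted above, in US dollars per million tokens.
INPUT_PRICE_PER_M = Decimal("0.050")
OUTPUT_PRICE_PER_M = Decimal("0.400")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


# Illustrative workload (assumed): 10,000 prompt tokens, 2,000 completion tokens.
print(estimate_cost(10_000, 2_000))
```

Because output tokens cost eight times as much as input tokens at these prices, long completions (including any reasoning tokens billed as output) are likely to dominate the total.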

How smart is Qwen3 8B?

Qwen3 8B scores 10.6 on the Artificial Analysis Intelligence Index, ranking 164th of 178 benchmarked models, with a GPQA Diamond score of 45%.
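The "smarter than" and "cheaper than" percentages on this page are consistent with a simple rank-to-share conversion. The sketch below shows one way such a figure could be derived; the function `share_outranked` is hypothetical, and the formula is an assumption about how the site computes it, not something the page states.

```python
def share_outranked(rank: int, total: int) -> float:
    """Percentage of ranked models that sit below a 1-based rank (assumed formula)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100 * (total - rank) / total


print(round(share_outranked(164, 178)))  # intelligence rank, prints 8
print(round(share_outranked(18, 298)))   # price rank among paid models, prints 94
```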

How fast is Qwen3 8B?

Qwen3 8B generates around 29 tokens per second with a 672ms time to first token (p50), making it the 240th fastest tracked model.
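As a rough illustration of what these two figures mean for a single response, the sketch below adds the time to first token to the streaming time at the quoted throughput. The helper `estimate_seconds` is hypothetical, and it assumes throughput stays constant after the first token and that the median figures apply to your request, which real traffic will not guarantee.

```python
TTFT_SECONDS = 0.672        # time to first token quoted above (p50)
TOKENS_PER_SECOND = 29.0    # approximate generation speed quoted above


def estimate_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate for one response (hypothetical helper)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Illustrative lengths (assumed): a short reply and a long one.
for tokens in (100, 1_000):
    print(f"{tokens} tokens: about {estimate_seconds(tokens):.1f} s")
```

At this speed the first-token latency matters mainly for short replies; for long outputs the streaming time dominates.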

What is Qwen3 8B's context window?

Qwen3 8B supports a 131K-token context window and can output up to 8K tokens. It accepts text input and produces text output.
