modelgrep
Q

Qwen: Qwen3 14B

qwen/qwen3-14b

136th smartest of 179Cheaper than 81% of paidReasoningToolsJSON
Use via OpenRouter ↗
Intelligence
16.2
136th of 179
Design Elo
Speed
66
114th fastest
Latency
349ms
first token
Input price
$0.100
58th cheapest
Context
132K
41K max out

How it compares

Smarter than24%
of all ranked models
Faster than62%
of all ranked models
Cheaper than81%
of all ranked models

Overview

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

Benchmarks

independent · via OpenRouter
Artificial Analysis23th percentile
Intelligence Index
16.2
Coding Index
13.1
Agentic Index
14.4
GPQA Diamond
60%
Humanity's Last Exam
4%
SciCode
32%
Tau²-Bench (agentic)
35%

Providers & pricing (3)

ProviderIn $/MOut $/MUptime
NextBitint4$0.100$0.24096.6%
DeepInfrafp8$0.120$0.240100%
Alibaba$0.228$0.910

Specifications

Context window132K
Max output41K
Knowledge cutoffMar 2025
Input modalitiestext
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo
Open weightsQwen/Qwen3-14B

Qwen3 14B FAQ

How much does Qwen3 14B cost?

Qwen3 14B costs $0.100 per million input tokens and $0.240 per million output tokens via OpenRouter, making it 58th cheapest of 298 paid models.

How smart is Qwen3 14B?

Qwen3 14B scores 16.2 on the Artificial Analysis Intelligence Index, ranking 136th of 179 benchmarked models, with a GPQA Diamond score of 60%.

How fast is Qwen3 14B?

Qwen3 14B generates around 66 tokens per second with 349ms time-to-first-token (p50), the 114th fastest tracked model.

What is Qwen3 14B's context window?

Qwen3 14B supports a 132K-token context window and can output up to 41K tokens. It accepts text input.

Compare head-to-head