modelgrep

Qwen2.5 72B Instruct

qwen/qwen-2.5-72b-instruct

Tools · JSON
Intelligence: (no value shown)
Design Elo: (no value shown)
Speed: 28 tokens/s (246th fastest)
Latency: 436ms to first token
Input price: $0.360 per million tokens (142nd cheapest)
Context: 131K (16K max output)

How it compares

Faster than 17% of all ranked models
Cheaper than 52% of all ranked models

Overview

Qwen2.5 72B is part of Qwen2.5, the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: significantly more knowledge and greatly improved capabilities in coding and...

Benchmarks

independent · via OpenRouter
Artificial Analysis
Coding Index: 11.9
GPQA Diamond: 49%
Humanity's Last Exam: 4%
SciCode: 27%
Tau²-Bench (agentic): 35%

Providers & pricing (2)

Provider   Quantization  In $/M  Out $/M  Uptime
DeepInfra  fp8           $0.360  $0.400   100%
Novita     bf16          $0.380  $0.400

Specifications

Context window: 131K
Max output: 16K
Knowledge cutoff: Jun 2024
Input modalities: text
Output modalities: text
Prompt caching: (no value shown)
Cache read price: (no value shown)
Moderated: No

Qwen2.5 72B Instruct FAQ

How much does Qwen2.5 72B Instruct cost?

Qwen2.5 72B Instruct costs $0.360 per million input tokens and $0.400 per million output tokens via OpenRouter, making it the 142nd cheapest of 298 paid models.
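
As a rough illustration of what these prices mean per request, the sketch below multiplies token counts by the listed per-million rates. The function name and the example workload are hypothetical, and it assumes no caching discounts, fees, or provider-specific price differences.

```python
from decimal import Decimal

# Prices quoted on this page (USD per million tokens, via OpenRouter).
INPUT_PRICE_PER_M = Decimal("0.360")
OUTPUT_PRICE_PER_M = Decimal("0.400")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed prices.

    Hypothetical helper: ignores caching discounts and any extra fees.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10,000 prompt tokens and 1,000 completion tokens.
print(estimate_cost_usd(10_000, 1_000))
```

At these rates, the example workload comes to roughly four tenths of a cent, with the prompt accounting for most of it.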

How fast is Qwen2.5 72B Instruct?

Qwen2.5 72B Instruct generates around 28 tokens per second with a 436ms time-to-first-token (p50), making it the 246th fastest tracked model.
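
To turn these two figures into an expected wait, a minimal sketch can add the time-to-first-token to the output length divided by throughput. It assumes throughput stays constant at the quoted rate and treats the p50 latency as typical; real responses will vary with load, provider, and prompt length. The function name is hypothetical.

```python
# Figures quoted on this page.
TOKENS_PER_SECOND = 28.0        # approximate generation speed
TIME_TO_FIRST_TOKEN_S = 0.436   # p50 time-to-first-token, in seconds


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock seconds to receive a full response.

    Simplification (an assumption, not a measured model): latency to the
    first token plus a constant streaming rate for all output tokens.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


# Hypothetical 500-token answer.
print(f"{estimate_response_seconds(500):.1f} s")
```

Under this model the generation rate dominates: for anything beyond a few dozen tokens, the 436ms startup latency is a small fraction of the total.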

What is Qwen2.5 72B Instruct's context window?

Qwen2.5 72B Instruct supports a 131K-token context window and can output up to 16K tokens. It accepts text input.
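
A request can be checked against these limits before it is sent. The sketch below assumes that "131K" and "16K" mean 131,072 and 16,384 tokens, and that requested output tokens count against the context window; neither assumption is confirmed by this page, and the function names are hypothetical.

```python
# Assumed exact limits: the page only says "131K" and "16K".
CONTEXT_WINDOW_TOKENS = 131_072
MAX_OUTPUT_TOKENS = 16_384


def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the request fits within both assumed limits.

    Assumes prompt and output tokens share the same context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS


def max_prompt_tokens(max_output_tokens: int) -> int:
    """Largest prompt that still leaves room for the requested output."""
    if not 0 <= max_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("max_output_tokens is outside the assumed output limit")
    return CONTEXT_WINDOW_TOKENS - max_output_tokens


# Hypothetical request: a 100,000-token prompt with the full output budget.
print(fits_context(100_000, MAX_OUTPUT_TOKENS))
print(max_prompt_tokens(MAX_OUTPUT_TOKENS))
```

Token counts depend on the tokenizer, so a real check would count tokens with the model's own tokenizer rather than estimating from characters or words.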
