modelgrep

Qwen: Qwen2.5 7B Instruct

qwen/qwen-2.5-7b-instruct

Cheaper than 96% of paid models
Intelligence
Design Elo
Speed: 65 tokens/s (127th fastest)
Latency: 353ms to first token
Input price: $0.040 per million tokens (11th cheapest)
Context: 131K (33K max output)

How it compares

Faster than 57% of all ranked models
Cheaper than 96% of all ranked models

Overview

Qwen2.5 7B Instruct belongs to Qwen2.5, the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: significantly more knowledge and greatly improved capabilities in coding and...

Providers & pricing (2)

Provider        In $/M   Out $/M  Uptime
Phala           $0.040   $0.100   92.1%
Together (fp8)  $0.300   $0.300   99.9%

Specifications

Context window: 131K
Max output: 33K
Knowledge cutoff: Jun 2024
Input modalities: text
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Qwen2.5 7B Instruct FAQ

How much does Qwen2.5 7B Instruct cost?

Qwen2.5 7B Instruct costs $0.040 per million input tokens and $0.100 per million output tokens via OpenRouter, making it 11th cheapest of 298 paid models.
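As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal Python sketch. The function name and the example token counts are hypothetical; the two prices are the ones quoted above, and actual billing may differ by provider.

```python
# Prices quoted above, in US dollars per one million tokens.
INPUT_PRICE_PER_M = 0.040
OUTPUT_PRICE_PER_M = 0.100


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 10,000-token prompt with a 2,000-token reply (made-up sizes).
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.6f}")
```

At these rates a request of that size costs a small fraction of a cent, so the total is driven mostly by request volume.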

How fast is Qwen2.5 7B Instruct?

Qwen2.5 7B Instruct generates around 65 tokens per second with 353ms time-to-first-token (p50), the 127th fastest tracked model.
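To see what these two figures mean for a whole response, the sketch below adds the time-to-first-token to the time spent streaming the remaining tokens. It assumes a constant generation rate, which is a simplification, and the function name is hypothetical; the 353ms and 65 tokens/s values are the medians quoted above.

```python
# Median figures quoted above.
TTFT_SECONDS = 0.353       # time to first token
TOKENS_PER_SECOND = 65.0   # generation throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough end-to-end time for a reply of the given length (hypothetical helper).

    Assumes throughput stays constant for the whole generation.
    """
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Example: a 500-token reply (made-up size) takes about 8 seconds.
print(f"{estimate_response_seconds(500):.2f} s")
```

For short replies the first-token latency dominates; for long ones the throughput term does.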

What is Qwen2.5 7B Instruct's context window?

Qwen2.5 7B Instruct supports a 131K-token context window and can output up to 33K tokens. It accepts text input.
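A simple way to reason about these two limits together is to compute how many output tokens remain for a given prompt size. The sketch below assumes that prompt and output tokens share the same context window and treats "131K" and "33K" as the round numbers 131,000 and 33,000; the exact limits are not stated above, and the function name is hypothetical.

```python
# Approximate limits, taken as round numbers from the figures above (assumption).
CONTEXT_WINDOW = 131_000
MAX_OUTPUT = 33_000


def max_completion_tokens(prompt_tokens: int) -> int:
    """Output tokens still available for a prompt of this size (hypothetical helper).

    Assumes the prompt and the output both count against the context window.
    """
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0
    return min(MAX_OUTPUT, remaining)


# A short prompt is capped by the output limit; a long one by the window.
print(max_completion_tokens(1_000))
print(max_completion_tokens(120_000))
```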
