modelgrep
Q

Qwen: Qwen3.5-9B

qwen/qwen3.5-9b

68th smartest of 180Cheaper than 84% of paidReasoningToolsJSONVision
Use via OpenRouter ↗
Intelligence
32.4
68th of 180
Design Elo
Speed
88
78th fastest
Latency
568ms
first token
Input price
$0.100
49th cheapest
Context
262K
262K max out

How it compares

Smarter than62%
of all ranked models
Faster than75%
of all ranked models
Cheaper than84%
of all ranked models

Overview

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

Benchmarks

independent · via OpenRouter
Artificial Analysis58th percentile
Intelligence Index
32.4
Coding Index
25.3
Agentic Index
37.4
GPQA Diamond
81%
Humanity's Last Exam
13%
SciCode
28%
Tau²-Bench (agentic)
87%

Providers & pricing (4)

ProviderIn $/MOut $/MUptime
SiliconFlowfp8$0.100$0.15098.3%
DeepInfrabf16$0.100$0.15099.7%
Venicefp8$0.100$0.15099.8%
Together$0.170$0.25099.6%

Specifications

Context window262K
Max output262K
Knowledge cutoff
Input modalitiestext, image, video
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo

Qwen3.5-9B FAQ

How much does Qwen3.5-9B cost?

Qwen3.5-9B costs $0.100 per million input tokens and $0.150 per million output tokens via OpenRouter, making it 49th cheapest of 298 paid models.

How smart is Qwen3.5-9B?

Qwen3.5-9B scores 32.4 on the Artificial Analysis Intelligence Index, ranking 68th of 180 benchmarked models, with a GPQA Diamond score of 81%.

How fast is Qwen3.5-9B?

Qwen3.5-9B generates around 88 tokens per second with 568ms time-to-first-token (p50), the 78th fastest tracked model.

What is Qwen3.5-9B's context window?

Qwen3.5-9B supports a 262K-token context window and can output up to 262K tokens. It accepts text, image, video input.

Compare head-to-head