modelgrep

Qwen: Qwen3.5-Flash

qwen/qwen3.5-flash-02-23

Cheaper than 90% of paid models
Reasoning, Tools, JSON, Vision
Intelligence
Design Elo
Speed: 85 tokens/s (83rd fastest)
Latency: 626ms to first token
Input price: $0.065 per million tokens (29th cheapest)
Context: 1M tokens (66K max output)

How it compares

Faster than 73% of all ranked models
Cheaper than 90% of all ranked models

Overview

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Providers & pricing (1)

Provider    In $/M    Out $/M    Uptime
Alibaba     $0.065    $0.260     100%

Specifications

Context window: 1M
Max output: 66K
Knowledge cutoff
Input modalities: text, image, video
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Qwen3.5-Flash FAQ

How much does Qwen3.5-Flash cost?

Qwen3.5-Flash costs $0.065 per million input tokens and $0.260 per million output tokens via OpenRouter, making it the 29th cheapest of 298 paid models.
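
As a rough illustration of how these rates translate into a per-request cost, the sketch below multiplies token counts by the listed prices. The helper name `estimate_cost_usd` and the example token counts are assumptions made for illustration; only the two prices come from the listing.

```python
# Listed OpenRouter prices for Qwen3.5-Flash, in USD per million tokens.
INPUT_PRICE_PER_M = 0.065
OUTPUT_PRICE_PER_M = 0.260


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed example workload: 200K input tokens and 4K output tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.5f}")
```

For the assumed workload this comes to roughly $0.014, with most of the cost coming from the input side despite the output rate being four times higher.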

How fast is Qwen3.5-Flash?

Qwen3.5-Flash generates around 85 tokens per second with a 626ms time-to-first-token (p50), making it the 83rd fastest tracked model.
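
To get a feel for what these two numbers mean for a whole response, one simple model is time-to-first-token plus output length divided by throughput. The sketch below assumes a constant generation rate and uses the median figures quoted above, so it is only a ballpark; the helper name `estimate_response_seconds` is hypothetical.

```python
# Median figures from the listing; real requests vary with load and prompt size.
TOKENS_PER_SECOND = 85.0
TIME_TO_FIRST_TOKEN_S = 0.626


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time for a response of the given length.

    Assumes a constant generation rate after the first token arrives.
    """
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.1f} s")
```

Under this model, time-to-first-token dominates only for very short replies; beyond a few hundred tokens the throughput term accounts for nearly all of the wait.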

What is Qwen3.5-Flash's context window?

Qwen3.5-Flash supports a 1M-token context window and can output up to 66K tokens. It accepts text, image, and video input.
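
A quick pre-flight check against these limits might look like the sketch below. It rests on two assumptions not confirmed by the listing: that the rounded figures "1M" and "66K" mean 1,000,000 and 66,000 tokens, and that prompt and output tokens share the same context window. The helper name `fits_context` is hypothetical.

```python
# Rounded limits from the listing; the exact token limits may differ slightly.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 66_000


def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumes prompt and output tokens count against the same window.
    """
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_context(900_000, 66_000))
    print(fits_context(950_000, 66_000))
```

Token counts depend on the model's tokenizer, so a check like this is best treated as a conservative guard with some margin left over.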
