modelgrep
Q

Qwen: Qwen3 VL 30B A3B Thinking

qwen/qwen3-vl-30b-a3b-thinking

120th smartest of 181Cheaper than 77% of paidReasoningToolsJSONVision
Use via OpenRouter ↗
Intelligence
19.7
120th of 181
Design Elo
Speed
69
116th fastest
Latency
457ms
first token
Input price
$0.130
69th cheapest
Context
131K
33K max out

How it compares

Smarter than34%
of all ranked models
Faster than61%
of all ranked models
Cheaper than77%
of all ranked models

Overview

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...

Benchmarks

independent · via OpenRouter
Artificial Analysis32th percentile
Intelligence Index
19.7
Coding Index
13.1
Agentic Index
14.9
GPQA Diamond
72%
Humanity's Last Exam
9%
SciCode
29%
Tau²-Bench (agentic)
20%

Providers & pricing (3)

ProviderIn $/MOut $/MUptime
Alibaba$0.130$1.56100%
Novitafp16$0.200$1.00100%
SiliconFlowfp8$0.290$1.00100%

Specifications

Context window131K
Max output33K
Knowledge cutoffMar 2025
Input modalitiestext, image
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo

Qwen3 VL 30B A3B Thinking FAQ

How much does Qwen3 VL 30B A3B Thinking cost?

Qwen3 VL 30B A3B Thinking costs $0.130 per million input tokens and $1.56 per million output tokens via OpenRouter, making it 69th cheapest of 298 paid models.

How smart is Qwen3 VL 30B A3B Thinking?

Qwen3 VL 30B A3B Thinking scores 19.7 on the Artificial Analysis Intelligence Index, ranking 120th of 181 benchmarked models, with a GPQA Diamond score of 72%.

How fast is Qwen3 VL 30B A3B Thinking?

Qwen3 VL 30B A3B Thinking generates around 69 tokens per second with 457ms time-to-first-token (p50), the 116th fastest tracked model.

What is Qwen3 VL 30B A3B Thinking's context window?

Qwen3 VL 30B A3B Thinking supports a 131K-token context window and can output up to 33K tokens. It accepts text, image input.

Compare head-to-head