modelgrep
Q

Qwen: Qwen3 VL 8B Instruct

qwen/qwen3-vl-8b-instruct

150th smartest of 178Cheaper than 87% of paidToolsJSONVision
Use via OpenRouter ↗
Intelligence
14.3
150th of 178
Design Elo
Speed
54
154th fastest
Latency
462ms
first token
Input price
$0.080
38th cheapest
Context
256K
33K max out

How it compares

Smarter than16%
of all ranked models
Faster than48%
of all ranked models
Cheaper than87%
of all ranked models

Overview

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

Benchmarks

independent · via OpenRouter
Artificial Analysis17th percentile
Intelligence Index
14.3
Coding Index
7.3
Agentic Index
18.4
GPQA Diamond
43%
Humanity's Last Exam
3%
SciCode
17%
Tau²-Bench (agentic)
29%

Providers & pricing (4)

ProviderIn $/MOut $/MUptime
Novitafp8$0.080$0.50098.2%
AtlasCloudfp8$0.080$0.50098.8%
Alibaba$0.117$0.45599.7%
Parasailbf16$0.250$0.75099.7%

Specifications

Context window256K
Max output33K
Knowledge cutoff
Input modalitiesimage, text
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo

Qwen3 VL 8B Instruct FAQ

How much does Qwen3 VL 8B Instruct cost?

Qwen3 VL 8B Instruct costs $0.080 per million input tokens and $0.500 per million output tokens via OpenRouter, making it 38th cheapest of 298 paid models.

How smart is Qwen3 VL 8B Instruct?

Qwen3 VL 8B Instruct scores 14.3 on the Artificial Analysis Intelligence Index, ranking 150th of 178 benchmarked models, with a GPQA Diamond score of 43%.

How fast is Qwen3 VL 8B Instruct?

Qwen3 VL 8B Instruct generates around 54 tokens per second with 462ms time-to-first-token (p50), the 154th fastest tracked model.

What is Qwen3 VL 8B Instruct's context window?

Qwen3 VL 8B Instruct supports a 256K-token context window and can output up to 33K tokens. It accepts image, text input.

Compare head-to-head