qwen/qwen3-vl-235b-a22b-instruct
Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| DeepInfrafp8 | $0.200 | $0.880 | 262K | 99.9% |
| Parasailfp8 | $0.210 | $1.90 | 131K | 98.4% |
| Venicefp8 | $0.250 | $1.50 | 256K | 91.9% |
| Alibaba | $0.260 | $1.04 | 131K | 99.9% |
| AtlasCloudfp8 | $0.300 | $1.50 | 131K | 98.3% |
| Novitabf16 | $0.300 | $1.50 | 131K | 96% |
Qwen3 VL 235B A22B Instruct costs $0.200 per million input tokens and $0.880 per million output tokens via OpenRouter, making it 98th cheapest of 298 paid models.
Qwen3 VL 235B A22B Instruct scores 17.0 on the Artificial Analysis Intelligence Index, ranking 135th of 178 benchmarked models, with a GPQA Diamond score of 61%.
Qwen3 VL 235B A22B Instruct generates around 36 tokens per second with 618ms time-to-first-token (p50), the 216th fastest tracked model.
Qwen3 VL 235B A22B Instruct supports a 262K-token context window and can output up to 16K tokens. It accepts text, image input.