qwen/qwen3-235b-a22b-2507
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.090 | $0.100 | 262K | 99.4% |
| Novita (fp8) | $0.090 | $0.580 | 131K | 99.9% |
| WandB (bf16) | $0.100 | $0.100 | 262K | 100% |
| Parasail (fp8) | $0.140 | $0.800 | 131K | 99.7% |
| Alibaba | $0.149 | $0.598 | 131K | 100% |
| Together | $0.200 | $0.600 | 262K | 93.5% |
| Friendli | $0.200 | $0.800 | 262K | 97.7% |
| AtlasCloud (fp8) | $0.200 | $0.880 | 131K | 100% |
| StreamLake | $0.210 | $0.840 | 128K | 100% |
| | $0.220 | $0.880 | 262K | 100% |
| | $0.250 | $1.00 | 262K | 100% |
Qwen3 235B A22B Instruct 2507 starts at $0.090 per million input tokens and $0.100 per million output tokens via OpenRouter (the lowest-priced provider listed, DeepInfra), making it the 45th cheapest of 298 paid models.
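To make the per-million pricing concrete, here is a minimal sketch of the cost arithmetic for a single request. The function name `request_cost_usd` and the example token counts are hypothetical; the prices are taken from the DeepInfra and Novita rows of the table, and actual billing may differ (for example through caching or provider routing).

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Estimate the cost of one request from per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * in_price_per_m
    output_cost = output_tokens / 1_000_000 * out_price_per_m
    return input_cost + output_cost


# Hypothetical request: 200,000 prompt tokens and 4,000 generated tokens.
deepinfra = request_cost_usd(200_000, 4_000, 0.090, 0.100)  # DeepInfra row
novita = request_cost_usd(200_000, 4_000, 0.090, 0.580)     # Novita row

print(f"DeepInfra: ${deepinfra:.5f}")
print(f"Novita:    ${novita:.5f}")
```

Because the two providers share an input price, the difference between them comes entirely from the output price, so it grows with the number of generated tokens.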
Qwen3 235B A22B Instruct 2507 scores 25.0 on the Artificial Analysis Intelligence Index, ranking 95th of 180 benchmarked models, with a GPQA Diamond score of 75%.
Qwen3 235B A22B Instruct 2507 generates around 82 tokens per second with a 291 ms time-to-first-token (p50), making it the 96th fastest tracked model.
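The throughput and time-to-first-token figures can be combined into a rough end-to-end latency estimate. The sketch below assumes a simple model (the first token arrives after the TTFT, then tokens stream at a constant rate); the function name is hypothetical, and real latency varies with provider, load, and prompt length.

```python
def estimate_generation_seconds(output_tokens: int,
                                tokens_per_second: float = 82.0,
                                ttft_ms: float = 291.0) -> float:
    """Rough latency estimate: time-to-first-token plus streaming time.

    Assumes a constant generation rate after the first token, which is
    a simplification of real serving behavior.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


# Hypothetical response lengths, in tokens.
for n in (100, 1_000, 16_000):
    print(f"{n:>6} tokens: ~{estimate_generation_seconds(n):.1f} s")
```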
Qwen3 235B A22B Instruct 2507 supports a context window of up to 262K tokens (some providers listed above cap it at 131K or 128K) and can output up to 16K tokens. It accepts text input.
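A request can be checked against these limits before it is sent. The sketch below is illustrative only: it uses the rounded figures from this page (262K and 16K, treated here as 262,000 and 16,000) and assumes that prompt and output tokens share the context window. The names `CONTEXT_WINDOW`, `MAX_OUTPUT`, and `fits_limits` are hypothetical, and exact limits depend on the provider.

```python
CONTEXT_WINDOW = 262_000  # rounded from "262K"; exact value is provider-dependent
MAX_OUTPUT = 16_000       # rounded from "16K"


def fits_limits(prompt_tokens: int, requested_output_tokens: int,
                context_window: int = CONTEXT_WINDOW,
                max_output: int = MAX_OUTPUT) -> bool:
    """Return True if the request stays within the output cap and window.

    Assumes prompt and output tokens are counted against one shared window.
    """
    if requested_output_tokens > max_output:
        return False
    return prompt_tokens + requested_output_tokens <= context_window


print(fits_limits(240_000, 16_000))                          # within the 262K window
print(fits_limits(120_000, 16_000, context_window=131_000))  # 131K provider
```

Passing a smaller `context_window` models the providers in the table that list 131K or 128K.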