qwen/qwen3.5-35b-a3b
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...
| Provider | Input ($/M tokens) | Output ($/M tokens) | Context | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.140 | $1.00 | 262K | — |
| Parasail (fp8) | $0.150 | $1.00 | 262K | 100% |
| AkashML (fp8) | $0.160 | $1.20 | 262K | 100% |
| Alibaba | $0.163 | $1.30 | 262K | 100% |
| AtlasCloud (fp8) | $0.225 | $1.80 | 262K | — |
| SiliconFlow (fp8) | $0.240 | $1.80 | 262K | — |
| WandB (fp8) | $0.250 | $1.25 | 262K | — |
| NextBit (fp8) | $0.300 | $1.80 | 262K | — |
| Venice | $0.313 | $1.25 | 256K | — |
Qwen3.5-35B-A3B starts at $0.140 per million input tokens and $1.00 per million output tokens via OpenRouter (the lowest-priced provider listed), making it the 73rd cheapest of 298 paid models.
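As a rough illustration of how the per-million-token prices translate into a per-request cost, the sketch below multiplies token counts by the listed rates. The `PRICES` dictionary and the `request_cost` helper are hypothetical names introduced for this example; the prices are copied from three rows of the provider table and may change over time.

```python
# Prices in USD per million tokens (input, output), copied from the table.
# Only three providers are included to keep the sketch short.
PRICES = {
    "DeepInfra": (0.140, 1.00),
    "Alibaba": (0.163, 1.30),
    "Venice": (0.313, 1.25),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    in_price, out_price = PRICES[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens on the cheapest listing.
    print(f"${request_cost('DeepInfra', 10_000, 2_000):.4f}")
```

Because output tokens cost roughly seven times as much as input tokens on the cheapest listing, long completions are likely to dominate the bill even when prompts are large.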
Qwen3.5-35B-A3B scores 30.7 on the Artificial Analysis Intelligence Index, ranking 78th of 178 benchmarked models, with a GPQA Diamond score of 82%.
Qwen3.5-35B-A3B generates around 180 tokens per second with a 333 ms time-to-first-token (p50), making it the 17th fastest tracked model.
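The two speed figures can be combined into a back-of-the-envelope response time: the wait for the first token plus the time to stream the rest. The sketch below assumes a constant generation rate, which real providers will not deliver exactly; `estimated_latency_s` is a hypothetical helper, and its defaults are simply the numbers quoted above.

```python
def estimated_latency_s(
    output_tokens: int,
    ttft_ms: float = 333.0,       # time-to-first-token quoted above
    tokens_per_s: float = 180.0,  # approximate throughput quoted above
) -> float:
    """Estimate seconds until a response of `output_tokens` tokens completes.

    Simplifying assumption: throughput stays constant for the whole response.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_s


if __name__ == "__main__":
    # A 900-token answer: 0.333 s to first token plus 5 s of generation.
    print(f"{estimated_latency_s(900):.3f} s")
```

Under these assumptions, time-to-first-token matters most for short replies, while throughput dominates for long ones.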
Qwen3.5-35B-A3B supports a 262K-token context window and can output up to 82K tokens. It accepts text, image, and video input.
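When sizing requests, the context window and the output cap interact. The sketch below makes two labeled assumptions: that "262K" and "82K" mean 262,000 and 82,000 tokens (the exact limits may differ), and that prompt and completion tokens share the same context window, as is common but not stated here. `max_output_budget` is a hypothetical helper.

```python
# Assumption: rounded figures from the text, read literally.
CONTEXT_WINDOW = 262_000  # total tokens (prompt + completion), assumed shared
MAX_OUTPUT = 82_000       # maximum completion tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest completion size that still fits next to the prompt."""
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt alone fills or exceeds the window
    return min(MAX_OUTPUT, remaining)


if __name__ == "__main__":
    for prompt in (100_000, 200_000, 300_000):
        print(prompt, max_output_budget(prompt))
```

With these numbers, the output cap is the binding limit until the prompt exceeds 180,000 tokens; beyond that, the remaining context shrinks the available completion size.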