qwen/qwen3.5-397b-a17b
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...
| Provider | Input ($/M tokens) | Output ($/M tokens) | Context | Uptime |
|---|---|---|---|---|
| Alibaba | $0.390 | $2.34 | 262K | 99.7% |
| Chutes (fp8) | $0.450 | $3.00 | 262K | 95.9% |
| DeepInfra (fp8) | $0.450 | $3.00 | 262K | 99.9% |
| Morph | $0.500 | $3.50 | 262K | 99% |
| Parasail (fp8) | $0.500 | $3.60 | 262K | 99.8% |
| AtlasCloud (fp8) | $0.550 | $3.50 | 262K | n/a |
| Phala | $0.550 | $3.50 | 262K | n/a |
| DigitalOcean | $0.550 | $3.50 | 131K | n/a |
| Novita | $0.600 | $3.60 | 262K | 99.2% |
| Together | $0.600 | $3.60 | 262K | 98.8% |
| StreamLake | $0.600 | $3.60 | 256K | n/a |
| GMICloud (fp8) | $0.600 | $3.60 | 262K | n/a |
| Venice | $0.750 | $4.50 | 128K | n/a |
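
The table can be filtered programmatically when choosing a provider. The following is a minimal sketch, not an official client: the `Provider` type and `cheapest_provider` helper are hypothetical names, "262K" is read as 262,000 tokens (the exact limit may differ), the `fp8` suffix on some provider names is treated as a quantization tag and dropped, and providers with no listed uptime are excluded whenever an uptime floor is requested.

```python
from typing import NamedTuple, Optional


class Provider(NamedTuple):
    name: str
    input_usd_per_m: float       # USD per million input tokens
    output_usd_per_m: float      # USD per million output tokens
    context_tokens: int          # "262K" read as 262_000 (assumption)
    uptime_pct: Optional[float]  # None where the table lists no uptime


# Rows transcribed from the provider table above.
PROVIDERS = [
    Provider("Alibaba", 0.390, 2.34, 262_000, 99.7),
    Provider("Chutes", 0.450, 3.00, 262_000, 95.9),
    Provider("DeepInfra", 0.450, 3.00, 262_000, 99.9),
    Provider("Morph", 0.500, 3.50, 262_000, 99.0),
    Provider("Parasail", 0.500, 3.60, 262_000, 99.8),
    Provider("AtlasCloud", 0.550, 3.50, 262_000, None),
    Provider("Phala", 0.550, 3.50, 262_000, None),
    Provider("DigitalOcean", 0.550, 3.50, 131_000, None),
    Provider("Novita", 0.600, 3.60, 262_000, 99.2),
    Provider("Together", 0.600, 3.60, 262_000, 98.8),
    Provider("StreamLake", 0.600, 3.60, 256_000, None),
    Provider("GMICloud", 0.600, 3.60, 262_000, None),
    Provider("Venice", 0.750, 4.50, 128_000, None),
]


def cheapest_provider(
    min_context: int, min_uptime: Optional[float] = None
) -> Optional[Provider]:
    """Return the lowest-priced provider meeting the context and uptime floors."""
    candidates = [
        p
        for p in PROVIDERS
        if p.context_tokens >= min_context
        and (
            min_uptime is None
            or (p.uptime_pct is not None and p.uptime_pct >= min_uptime)
        )
    ]
    # Rank by input price first, then output price; None if nothing qualifies.
    return min(
        candidates,
        key=lambda p: (p.input_usd_per_m, p.output_usd_per_m),
        default=None,
    )


best = cheapest_provider(min_context=200_000, min_uptime=99.8)
print(best.name if best else "no match")  # DeepInfra
```

Ranking by input price first suits prompt-heavy workloads; for generation-heavy workloads, swapping the order of the sort key may be more appropriate.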
Via OpenRouter, Qwen3.5 397B A17B starts at $0.390 per million input tokens and $2.34 per million output tokens (the Alibaba rate in the table above), making it the 144th-cheapest of 298 paid models.
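
As a worked example of the per-million-token pricing, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, the default rates are the lowest listed ones ($0.390 input, $2.34 output), and the token counts are illustrative; actual billing may differ by provider.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_usd_per_m: float = 0.390,  # lowest listed input rate
    output_usd_per_m: float = 2.34,  # lowest listed output rate
) -> float:
    """Estimate request cost in USD from per-million-token rates."""
    return (
        input_tokens / 1_000_000 * input_usd_per_m
        + output_tokens / 1_000_000 * output_usd_per_m
    )


# 100,000 prompt tokens and 5,000 generated tokens:
# 0.1 * $0.390 + 0.005 * $2.34 = $0.039 + $0.0117
print(f"${estimate_cost_usd(100_000, 5_000):.4f}")  # $0.0507
```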
Qwen3.5 397B A17B scores 40.1 on the Artificial Analysis Intelligence Index, ranking 40th of 180 benchmarked models, with a GPQA Diamond score of 86%.
Qwen3.5 397B A17B generates around 134 tokens per second with a 634 ms time-to-first-token (p50), making it the 35th-fastest tracked model.
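
These two figures give a rough estimate of end-to-end response time: time-to-first-token plus output length divided by throughput. The sketch below assumes throughput stays constant at the quoted 134 tokens per second and uses the median (p50) time-to-first-token, so real latency will vary with provider, load, and prompt length. The function name `estimate_response_seconds` is hypothetical.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 134.0,  # quoted generation speed
    ttft_seconds: float = 0.634,       # quoted p50 time-to-first-token
) -> float:
    """Rough response time: first-token delay plus steady-state generation."""
    return ttft_seconds + output_tokens / tokens_per_second


# A 1,000-token reply: 0.634 s + 1000 / 134 s
print(f"{estimate_response_seconds(1_000):.1f} s")  # 8.1 s
```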
Qwen3.5 397B A17B supports a context window of up to 262K tokens (some providers in the table above offer less) and can output up to 66K tokens. It accepts text, image, and video input.