qwen/qwen3-32b
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.080 | $0.280 | 41K | 99.9% |
| Nebius (fp8) | $0.100 | $0.300 | 41K | 99.6% |
| AtlasCloud (fp8) | $0.100 | $1.20 | 41K | 99.4% |
| Alibaba | $0.104 | $0.416 | 131K | 99.3% |
| SiliconFlow (fp8) | $0.140 | $0.570 | 131K | 98.5% |
| Groq | $0.290 | $0.590 | 131K | 99.7% |
Via OpenRouter, Qwen3 32B starts at $0.080 per million input tokens and $0.280 per million output tokens (the DeepInfra rate in the table above), making it the 40th cheapest of 298 paid models.
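To make the per-million pricing concrete, here is a minimal Python sketch that estimates the cost of a single request. The rates are copied from the provider table above. The `PRICES` dictionary and the `request_cost` helper are hypothetical names introduced for illustration and are not part of any OpenRouter API.

```python
# USD per million tokens (input, output), copied from the provider table.
# Only two providers are listed here to keep the sketch short.
PRICES = {
    "DeepInfra": (0.080, 0.280),
    "Groq": (0.290, 0.590),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    in_rate, out_rate = PRICES[provider]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 2,000-token reply.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 10_000, 2_000):.5f}")
```

In "thinking" mode the reasoning tokens are likely billed as output, so real output counts may be noticeably higher than the visible reply; treat the result as a lower bound.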
Qwen3 32B generates around 361 tokens per second with a 322 ms time-to-first-token (p50), making it the 9th fastest tracked model.
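As a rough illustration of what these figures mean for a single response, the sketch below adds the time-to-first-token to the time needed to stream the remaining tokens. It assumes throughput stays constant at the quoted figure and ignores network overhead and provider-to-provider variation; `estimated_latency` is a hypothetical helper, not a measured result.

```python
# Median figures quoted above; real values vary by provider and load.
TTFT_SECONDS = 0.322        # time to first token (p50)
TOKENS_PER_SECOND = 361.0   # approximate generation throughput


def estimated_latency(output_tokens: int) -> float:
    """Estimate seconds until a response of `output_tokens` finishes streaming."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 1_000, 16_000):
        print(f"{n} tokens: about {estimated_latency(n):.1f} s")
```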
Qwen3 32B supports a context window of up to 131K tokens (some providers in the table above cap it at 41K) and can output up to 16K tokens. It accepts text input.