deepseek/deepseek-r1-distill-qwen-32b
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| NextBit (fp8) | $0.290 | $0.290 | 33K | — |
R1 Distill Qwen 32B costs $0.290 per million input tokens and $0.290 per million output tokens via OpenRouter, making it the 127th cheapest of 298 paid models.
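As a rough illustration of what these rates mean per request, the sketch below multiplies token counts by the listed per-million prices. The function name `estimate_cost` and the example token counts are hypothetical, and the estimate assumes the listed prices apply uniformly with no extra fees or discounts.

```python
# Listed prices in USD per million tokens (from the pricing line above).
INPUT_PRICE_PER_M = 0.290
OUTPUT_PRICE_PER_M = 0.290


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes flat per-token pricing with no additional fees.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(10_000, 2_000):.6f}")
```

Because input and output are priced identically here, only the total token count matters: 12,000 tokens comes to about a third of a cent.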
R1 Distill Qwen 32B generates around 23 tokens per second with 970ms time-to-first-token (p50), the 270th fastest tracked model.
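To get a feel for these figures, a back-of-the-envelope estimate adds the time-to-first-token to the time needed to stream the reply. The sketch below assumes a simple linear model built from the median values quoted above; the helper name `estimate_latency` is hypothetical, and real latency will vary with load, provider, and prompt length.

```python
# Median figures quoted in the listing; actual values fluctuate.
TTFT_SECONDS = 0.970       # time-to-first-token (p50)
TOKENS_PER_SECOND = 23.0   # approximate generation throughput


def estimate_latency(output_tokens: int) -> float:
    """Estimate total response time in seconds (hypothetical helper).

    Assumes a linear model: a fixed startup delay plus constant throughput.
    """
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Example: a 500-token reply.
    print(f"{estimate_latency(500):.1f} s")
```

Under these assumptions a 500-token reply takes roughly 22.7 seconds, which matters for reasoning-style outputs that tend to run long.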
R1 Distill Qwen 32B supports a 128K-token context window and can output up to 33K tokens. It accepts text input.