deepseek/deepseek-v4-flash
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...
| Provider | Input $/M | Output $/M | Context | Uptime |
|---|---|---|---|---|
| GMICloud (fp8) | $0.098 | $0.196 | 1.0M | 97.8% |
| Baidu (fp8) | $0.098 | $0.197 | 1.0M | 99.9% |
| DeepInfra (fp4) | $0.100 | $0.200 | 1.0M | 99.9% |
| Cloudflare | $0.100 | $0.200 | 384K | 99.8% |
| DigitalOcean | $0.105 | $0.210 | 262K | 97.3% |
| SiliconFlow (fp8) | $0.130 | $0.280 | 1.0M | 99.7% |
| StreamLake | $0.133 | $0.266 | 1.0M | 99.3% |
| Alibaba | $0.134 | $0.268 | 1.0M | 99.9% |
| Morph | $0.139 | $0.278 | 1.0M | 99.6% |
| DeepSeek (cache) | $0.140 | $0.280 | 1.0M | 99.9% |
| Parasail (fp8) | $0.140 | $0.280 | 1.0M | 99.4% |
| AtlasCloud (fp8) | $0.140 | $0.280 | 1.0M | 99.7% |
| AkashML (fp8) | $0.140 | $0.280 | 1.0M | 99.5% |
| Fireworks | $0.140 | $0.280 | 1.0M | 97.7% |
| Novita (fp8) | $0.140 | $0.280 | 1.0M | 99.9% |
| WandB (fp8) | $0.140 | $0.280 | 1.0M | 99.9% |
| Venice | $0.170 | $0.350 | 1.0M | 97.4% |
| Io Net (fp8) | $0.204 | $0.390 | 1.0M | 99.9% |
DeepSeek V4 Flash starts at $0.098 per million input tokens and $0.196 per million output tokens via OpenRouter, making it the 47th cheapest of 298 paid models.
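As a rough illustration of what these per-million-token rates mean for a single request, the sketch below multiplies token counts by the quoted prices. The function name and the example token counts are hypothetical, and the calculation assumes flat per-token billing with no caching discounts or provider-specific fees.

```python
# Rates quoted above, in USD per million tokens.
INPUT_USD_PER_M = 0.098
OUTPUT_USD_PER_M = 0.196


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token billing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
cost = estimate_cost_usd(200_000, 4_000)
print(f"${cost:.4f}")  # prints $0.0204
```

Swapping in another provider's rates from the table gives a like-for-like comparison for the same workload.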
DeepSeek V4 Flash scores 46.0 on the Artificial Analysis Intelligence Index, ranking 23rd of 178 benchmarked models, with a GPQA Diamond score of 87%.
DeepSeek V4 Flash generates around 85 tokens per second with a median (p50) time-to-first-token of 544 ms, making it the 84th fastest tracked model.
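These two figures can be combined into a back-of-the-envelope estimate of how long a response takes to finish. The sketch below is only an approximation: it assumes throughput stays constant at the quoted rate after the first token arrives, and it ignores network overhead, queuing, and variation between providers. The function name and the example response length are hypothetical.

```python
# Figures quoted above (median values).
TTFT_SECONDS = 0.544        # time-to-first-token, p50
TOKENS_PER_SECOND = 85.0    # approximate generation throughput


def estimate_latency_seconds(output_tokens: int) -> float:
    """Rough end-to-end latency: wait for the first token, then stream the rest."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Hypothetical response of 850 output tokens.
latency = estimate_latency_seconds(850)
print(f"{latency:.1f} s")  # prints 10.5 s
```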
DeepSeek V4 Flash supports a context window of up to 1.0M tokens, although some providers cap it lower (Cloudflare at 384K, DigitalOcean at 262K). It accepts text input.