inception/mercury-2
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...
| Provider | In $/M | Out $/M | Context | Uptime |
|---|---|---|---|---|
| Inception | $0.250 | $0.750 | 128K | 99.8% |
Via OpenRouter, Mercury 2 costs $0.250 per million input tokens and $0.750 per million output tokens, making it the 110th cheapest of 298 paid models.
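To make the per-million pricing concrete, here is a minimal sketch of a per-request cost estimate. The rates come from the table above; the function name `estimate_cost_usd` and the token counts are illustrative assumptions, and real billing may differ (for example through rounding or provider-specific fees).

```python
# Listed rates in USD per one million tokens (from the pricing table).
INPUT_USD_PER_M = 0.250
OUTPUT_USD_PER_M = 0.750


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates.

    Hypothetical helper: assumes simple linear billing with no
    minimums, discounts, or extra fees.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

Under these assumptions, a 10,000-token prompt with a 2,000-token reply comes to about $0.004, with the prompt accounting for most of it despite the lower input rate.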
Mercury 2 scores 32.8 on the Artificial Analysis Intelligence Index, ranking 69th of 178 benchmarked models, with a GPQA Diamond score of 77%.
Mercury 2 generates around 386 tokens per second with a 254 ms time-to-first-token (p50), making it the 8th fastest tracked model.
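These two figures can be combined into a rough back-of-the-envelope estimate of response time. The sketch below assumes a simple linear model (time-to-first-token plus output length divided by throughput). That is an approximation, not a description of how the measurements were taken: a diffusion model refines tokens in parallel, and real latency varies with load and prompt length. The name `estimate_latency_seconds` is hypothetical.

```python
# Median figures quoted above.
TTFT_SECONDS = 0.254       # time-to-first-token (p50)
TOKENS_PER_SECOND = 386.0  # approximate output throughput


def estimate_latency_seconds(output_tokens: int) -> float:
    """Rough end-to-end latency for a response of the given length.

    Assumes constant throughput once generation starts; treat the
    result as an order-of-magnitude guide only.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} tokens: ~{estimate_latency_seconds(n):.2f} s")
```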
Mercury 2 supports a 128K-token context window and can output up to 50K tokens. It accepts text input.
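A request can be checked against these limits before it is sent. The sketch below makes two assumptions that the listing does not confirm: that "128K" and "50K" mean 128,000 and 50,000 tokens, and that prompt and output tokens share the same context window. The helper name `fits_limits` is hypothetical.

```python
# Limits as listed; the exact token counts are assumed (K = 1,000).
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 50_000


def fits_limits(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumes the prompt and the requested output together must fit
    inside the context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    within_output_cap = max_output_tokens <= MAX_OUTPUT_TOKENS
    within_context = prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS
    return within_output_cap and within_context


if __name__ == "__main__":
    print(fits_limits(100_000, 20_000))  # within both limits
    print(fits_limits(100_000, 40_000))  # exceeds the context window
```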