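OpenAI: gpt-oss-120b leads on 8 of the 9 scored metrics (one of them, Design Arena Elo, only because Inception: Mercury 2 has no listed score), while Mercury 2 leads on the Coding Index. The right pick still depends on what you optimize for; see the breakdown below.
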
| Metric | Inception: Mercury 2 | OpenAI: gpt-oss-120b |
|---|---|---|
| Intelligence Index | 32.8 | 33.3✓ |
| Coding Index | 30.6✓ | 28.6 |
| GPQA Diamond | 77% | 78%✓ |
| Design Arena Elo | — | 1062✓ |
| Speed (tokens/sec) | 476 | 554✓ |
| Latency | 253ms | 161ms✓ |
| Input price /M | $0.250 | $0.039✓ |
| Output price /M | $0.750 | $0.180✓ |
| Context window | 128K | 131K✓ |
| Capabilities | Reasoning, Tools, JSON | Reasoning, Tools, JSON |
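
To turn the price and speed rows into something comparable for a specific workload, the arithmetic can be sketched as below. This is a minimal illustration, not an official calculator: the function names and the 2,000-input / 500-output token request are hypothetical, and it assumes that "Latency" means time to first token and that "Speed" is output-token throughput, which the table does not state.

```python
# Figures copied from the comparison table (prices in USD per million tokens).
MODELS = {
    "Inception: Mercury 2": {
        "input_per_m": 0.250, "output_per_m": 0.750,
        "tokens_per_sec": 476, "latency_ms": 253,
    },
    "OpenAI: gpt-oss-120b": {
        "input_per_m": 0.039, "output_per_m": 0.180,
        "tokens_per_sec": 554, "latency_ms": 161,
    },
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-million-token prices."""
    m = MODELS[model]
    total = input_tokens * m["input_per_m"] + output_tokens * m["output_per_m"]
    return total / 1_000_000


def response_time(model: str, output_tokens: int) -> float:
    """Rough seconds per response.

    Assumption: latency is time to first token and speed is the
    output-token generation rate; real timings vary by provider and load.
    """
    m = MODELS[model]
    return m["latency_ms"] / 1000 + output_tokens / m["tokens_per_sec"]


if __name__ == "__main__":
    # Hypothetical request size, chosen only for illustration.
    for name in MODELS:
        cost = request_cost(name, input_tokens=2_000, output_tokens=500)
        secs = response_time(name, output_tokens=500)
        print(f"{name}: ${cost:.6f} per request, ~{secs:.2f} s")
```

For that hypothetical request, the listed prices work out to about $0.000875 for Mercury 2 and $0.000168 for gpt-oss-120b, so the price gap matters more as output volume grows.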