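Microsoft: Phi 4 wins 5 of the 8 scored metrics (Coding Index, GPQA Diamond, speed, latency, and output price), while Qwen: Qwen3 8B leads on Intelligence Index, input price, and context window. The right pick depends on what you optimize for; see the breakdown below.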
| Metric | Microsoft: Phi 4 | Qwen: Qwen3 8B |
|---|---|---|
| Intelligence Index | 10.4 | 10.6✓ |
| Coding Index | 11.2✓ | 7.1 |
| GPQA Diamond | 57%✓ | 45% |
| Design Arena Elo | — | — |
| Speed (tokens/sec) | 65✓ | 30 |
| Latency | 214ms✓ | 656ms |
| Input price (per 1M tokens) | $0.065 | $0.050✓ |
| Output price (per 1M tokens) | $0.140✓ | $0.400 |
| Context window | 16K | 131K✓ |
| Capabilities | JSON | Reasoning, Tools, JSON |
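
The win count in the summary can be reproduced with a short tally. The sketch below is illustrative only: the dictionary layout, the `tally_wins` helper, and the `phi4`/`qwen3` keys are hypothetical names, and the direction of each metric (higher or lower is better) is an assumption inferred from the checkmarks in the table. Rows without data, such as Design Arena Elo, are skipped, and the Capabilities row is left out because it is not a numeric score.

```python
# Values copied from the comparison table; None marks missing data.
# Each entry: metric -> (phi4_value, qwen3_value, higher_is_better)
# The higher_is_better flags are assumptions inferred from the table's checkmarks.
METRICS = {
    "Intelligence Index": (10.4, 10.6, True),
    "Coding Index": (11.2, 7.1, True),
    "GPQA Diamond (%)": (57, 45, True),
    "Design Arena Elo": (None, None, True),
    "Speed (tokens/sec)": (65, 30, True),
    "Latency (ms)": (214, 656, False),
    "Input price ($ per 1M tokens)": (0.065, 0.050, False),
    "Output price ($ per 1M tokens)": (0.140, 0.400, False),
    "Context window (K tokens)": (16, 131, True),
}


def tally_wins(metrics):
    """Count how many metrics each model wins, skipping missing data and ties."""
    wins = {"phi4": 0, "qwen3": 0}
    for phi4, qwen3, higher_is_better in metrics.values():
        if phi4 is None or qwen3 is None or phi4 == qwen3:
            continue  # no winner can be declared for this row
        # Phi 4 wins when it is larger on a higher-is-better metric,
        # or smaller on a lower-is-better metric.
        phi4_wins = (phi4 > qwen3) == higher_is_better
        wins["phi4" if phi4_wins else "qwen3"] += 1
    return wins


wins = tally_wins(METRICS)
print(wins)  # {'phi4': 5, 'qwen3': 3}
```

A plain tally treats every metric as equally important. If context window or tool support matters most for a given workload, a single row can outweigh the overall count.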