11 models tracked · ranked by intelligence, speed & price
NVIDIA's smartest model is Nemotron 3 Ultra (free), which scores 47.7 on the Intelligence Index. Its fastest is Nemotron 3 Nano 30B A3B (free) at 180 tokens/sec, and its cheapest paid model is Nemotron 3 Nano 30B A3B at $0.050 per million input tokens. All 11 NVIDIA models are compared below by intelligence, speed, latency, context and price.
| Model | Capabilities | Intelligence | Speed (tokens/sec) | Latency | Input $/M tokens | Context |
|---|---|---|---|---|---|---|
| nemotron-3-ultra-550b-a55b:free | Reasoning, Tools | 47.7 | 112 | 746ms | Free | 1M |
| nemotron-3-ultra-550b-a55b | Reasoning, Tools, JSON | 47.7 | 112 | 746ms | $0.500 | 1M |
| nemotron-3-super-120b-a12b:free | Reasoning, Tools, JSON | 36.0 | 172 | 1.1s | Free | 1M |
| nemotron-3-super-120b-a12b | Reasoning, Tools, JSON | 36.0 | 142 | 768ms | $0.090 | 1M |
| nemotron-3-nano-omni-30b-a3b-reasoning:free | Reasoning, Tools, Vision, +1 | 21.4 | 173 | 720ms | Free | 256K |
| llama-3.3-nemotron-super-49b-v1.5 | Reasoning, Tools, JSON | 18.7 | 49 | 168ms | $0.400 | 131K |
| nemotron-nano-12b-v2-vl:free | Reasoning, Tools, Vision | 14.9 | — | — | Free | 128K |
| nemotron-nano-9b-v2:free | Reasoning, Tools, JSON | 14.8 | 49 | 1.3s | Free | 128K |
| nemotron-3-nano-30b-a3b:free | Reasoning, Tools | 13.2 | 180 | 442ms | Free | 256K |
| nemotron-3-nano-30b-a3b | Reasoning, Tools, JSON | 13.2 | 176 | 508ms | $0.050 | 262K |
| nemotron-3.5-content-safety:free | Reasoning, Vision | — | 55 | 565ms | Free | 128K |
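
As a rough illustration of how the summary picks could be derived from the table above, here is a minimal Python sketch. The `Row` type and the helper names `smartest`, `fastest` and `cheapest_paid` are hypothetical and exist only for this example. It assumes that "cheapest" means the lowest input price among paid models, that rows with a missing value are skipped for that metric, and that ties go to the row listed first.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    name: str
    intelligence: Optional[float]  # Intelligence Index score, None if not listed
    speed: Optional[float]         # tokens/sec, None if not listed
    price: Optional[float]         # USD per million input tokens, None if free


# Values copied from the table above.
ROWS = [
    Row("nemotron-3-ultra-550b-a55b:free", 47.7, 112, None),
    Row("nemotron-3-ultra-550b-a55b", 47.7, 112, 0.500),
    Row("nemotron-3-super-120b-a12b:free", 36.0, 172, None),
    Row("nemotron-3-super-120b-a12b", 36.0, 142, 0.090),
    Row("nemotron-3-nano-omni-30b-a3b-reasoning:free", 21.4, 173, None),
    Row("llama-3.3-nemotron-super-49b-v1.5", 18.7, 49, 0.400),
    Row("nemotron-nano-12b-v2-vl:free", 14.9, None, None),
    Row("nemotron-nano-9b-v2:free", 14.8, 49, None),
    Row("nemotron-3-nano-30b-a3b:free", 13.2, 180, None),
    Row("nemotron-3-nano-30b-a3b", 13.2, 176, 0.050),
    Row("nemotron-3.5-content-safety:free", None, 55, None),
]


def smartest(rows):
    """Highest Intelligence Index; rows without a score are ignored."""
    scored = [r for r in rows if r.intelligence is not None]
    return max(scored, key=lambda r: r.intelligence)


def fastest(rows):
    """Highest tokens/sec; rows without a speed measurement are ignored."""
    timed = [r for r in rows if r.speed is not None]
    return max(timed, key=lambda r: r.speed)


def cheapest_paid(rows):
    """Lowest input price among paid models (assumption: free rows excluded)."""
    paid = [r for r in rows if r.price is not None]
    return min(paid, key=lambda r: r.price)


if __name__ == "__main__":
    print("smartest:", smartest(ROWS).name)
    print("fastest:", fastest(ROWS).name)
    print("cheapest paid:", cheapest_paid(ROWS).name)
```

Because Python's `max` and `min` return the first matching item on a tie, the free Nemotron 3 Ultra row wins the intelligence pick over its identically scored paid variant, which matches the summary above.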