modelgrep
N

NVIDIA: Nemotron 3 Super

nvidia/nemotron-3-super-120b-a12b

58th smartest of 180Cheaper than 86% of paidReasoningToolsJSON
Use via OpenRouter ↗
Intelligence
36.0
58th of 180
Design Elo
Speed
240
15th fastest
Latency
1.2s
first token
Input price
$0.090
42nd cheapest
Context
1M

How it compares

Smarter than68%
of all ranked models
Faster than95%
of all ranked models
Cheaper than86%
of all ranked models

Overview

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Benchmarks

independent · via OpenRouter
Artificial Analysis65th percentile
Intelligence Index
36.0
Coding Index
31.2
Agentic Index
40.2
GPQA Diamond
80%
Humanity's Last Exam
19%
SciCode
36%
Tau²-Bench (agentic)
68%

Providers & pricing (4)

ProviderIn $/MOut $/MUptime
DekaLLMfp8$0.090$0.45099.1%
DeepInfrabf16$0.100$0.50098.2%
DigitalOcean$0.300$0.65099.8%
Nebiusfp4$0.300$0.900

Specifications

Context window1M
Max output
Knowledge cutoff
Input modalitiestext
Output modalitiestext
Prompt caching
Cache read price
ModeratedNo

Nemotron 3 Super FAQ

How much does Nemotron 3 Super cost?

Nemotron 3 Super costs $0.090 per million input tokens and $0.450 per million output tokens via OpenRouter, making it 42nd cheapest of 298 paid models.

How smart is Nemotron 3 Super?

Nemotron 3 Super scores 36.0 on the Artificial Analysis Intelligence Index, ranking 58th of 180 benchmarked models, with a GPQA Diamond score of 80%.

How fast is Nemotron 3 Super?

Nemotron 3 Super generates around 240 tokens per second with 1.2s time-to-first-token (p50), the 15th fastest tracked model.

What is Nemotron 3 Super's context window?

Nemotron 3 Super supports a 1M-token context window. It accepts text input.

Compare head-to-head