modelgrep

NVIDIA: Nemotron 3 Nano 30B A3B

nvidia/nemotron-3-nano-30b-a3b

154th smartest of 178 · Cheaper than 95% of paid · Reasoning · Tools · JSON
Intelligence: 13.2 (154th of 178)
Design Elo
Speed: 176 tokens per second (20th fastest)
Latency: 508ms to first token
Input price: $0.050 per million tokens (16th cheapest)
Context: 262K tokens (228K max output)

How it compares

Smarter than 13% of all ranked models
Faster than 93% of all ranked models
Cheaper than 95% of all ranked models

Overview

NVIDIA Nemotron 3 Nano 30B A3B is a small Mixture-of-Experts (MoE) language model designed for high compute efficiency and accuracy, aimed at developers building specialized agentic AI systems. The model is fully...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 14th percentile
Intelligence Index: 13.2
Coding Index: 15.8
Agentic Index: 8.5
GPQA Diamond: 40%
Humanity's Last Exam: 5%
SciCode: 23%
Tau²-Bench (agentic): 25%

Providers & pricing (2)

Provider   Precision  In $/M  Out $/M  Uptime
DeepInfra  fp4        $0.050  $0.200   94.5%
Nebius     fp8        $0.060  $0.240   84.5%

Specifications

Context window: 262K
Max output: 228K
Knowledge cutoff
Input modalities: text
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Nemotron 3 Nano 30B A3B FAQ

How much does Nemotron 3 Nano 30B A3B cost?

Nemotron 3 Nano 30B A3B costs $0.050 per million input tokens and $0.200 per million output tokens via OpenRouter, making it the 16th cheapest of 298 paid models.
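
To make the per-million pricing concrete, here is a minimal Python sketch that estimates the cost of one request from the two listed prices. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any OpenRouter API, and actual billing may differ by provider.

```python
# Prices in USD per million tokens, copied from the listing above.
INPUT_PRICE_PER_M = 0.050
OUTPUT_PRICE_PER_M = 0.200


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: a 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.6f}")
```

Because output tokens cost four times as much as input tokens at these prices, long generations tend to dominate the bill for this model.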

How smart is Nemotron 3 Nano 30B A3B?

Nemotron 3 Nano 30B A3B scores 13.2 on the Artificial Analysis Intelligence Index, ranking 154th of 178 benchmarked models, with a GPQA Diamond score of 40%.

How fast is Nemotron 3 Nano 30B A3B?

Nemotron 3 Nano 30B A3B generates around 176 tokens per second with a 508ms time-to-first-token (p50), making it the 20th fastest tracked model.
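
As a rough illustration of what these two figures mean together, the sketch below estimates total response time as time-to-first-token plus output length divided by throughput. It assumes constant throughput and the median latency quoted above; real timings vary with provider, load, and prompt length, and `estimate_response_seconds` is a hypothetical helper.

```python
# Figures copied from the listing above; both are approximate medians.
TTFT_SECONDS = 0.508        # time to first token
TOKENS_PER_SECOND = 176.0   # generation throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate for a full response (hypothetical helper)."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Illustrative output lengths, in tokens.
    for n in (100, 1_000, 10_000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.1f} s")
```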

What is Nemotron 3 Nano 30B A3B's context window?

Nemotron 3 Nano 30B A3B supports a 262K-token context window and can output up to 228K tokens. It accepts text input.
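
The two limits can be checked before sending a request. The sketch below makes two labeled assumptions: that "262K" and "228K" mean roughly 262,000 and 228,000 tokens (the exact values are not given here), and that prompt and output share the same context window, which is common but not confirmed by this page. `fits_context` is a hypothetical helper.

```python
# Approximate limits taken from the listing above (exact values not stated).
CONTEXT_WINDOW_TOKENS = 262_000
MAX_OUTPUT_TOKENS = 228_000


def fits_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against both limits (hypothetical helper).

    Assumes prompt and output tokens count against the same window.
    """
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_context(30_000, 228_000))  # within both limits
    print(fits_context(40_000, 228_000))  # exceeds the shared window
```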
