modelgrep

Cheapest NVIDIA Models

Quick answer · Updated June 2026

The cheapest NVIDIA model is Nemotron 3 Nano 30B A3B at $0.050 per million input tokens. Nemotron 3 Super ($0.090) and Llama 3.3 Nemotron Super 49B V1.5 ($0.400) round out the top three.

Input price: $0.050 /M
Intelligence: 13.2
Speed: 159 t/s
Context: 262K

AI models ranked by input token price. The most affordable large language model APIs, from budget open-weight models to discounted frontier models.

  1. nemotron-3-nano-30b-a3b
    Reasoning · Tools · JSON
    13.2 intel · 159 t/s · 651ms ttft
    $0.050 input /M
  2. nemotron-3-super-120b-a12b
    Reasoning · Tools · JSON
    36.0 intel · 238 t/s · 739ms ttft
    $0.090 input /M
  3. llama-3.3-nemotron-super-49b-v1.5
    Reasoning · Tools · JSON
    14.6 intel · 44 t/s · 170ms ttft
    $0.400 input /M
  4. nemotron-3-ultra-550b-a55b
    Reasoning · Tools · JSON
    47.7 intel · 84 t/s · 720ms ttft
    $0.500 input /M
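
As a rough illustration of how a ranking like this can be reproduced, the sketch below sorts the four listed models by input price. The figures are copied from the list above; the `Model` class and `rank_by_input_price` function are hypothetical names used only for this example and are not part of any modelgrep API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    slug: str
    input_price_per_m: float  # USD per million input tokens
    intelligence: float       # intelligence score as shown on the page
    tokens_per_s: int         # output speed
    ttft_ms: int              # time to first token, in milliseconds


# Values taken from the ranking on this page, deliberately listed out of order.
MODELS = [
    Model("nemotron-3-ultra-550b-a55b", 0.500, 47.7, 84, 720),
    Model("llama-3.3-nemotron-super-49b-v1.5", 0.400, 14.6, 44, 170),
    Model("nemotron-3-super-120b-a12b", 0.090, 36.0, 238, 739),
    Model("nemotron-3-nano-30b-a3b", 0.050, 13.2, 159, 651),
]


def rank_by_input_price(models):
    """Return the models ordered from cheapest to most expensive input price."""
    return sorted(models, key=lambda m: m.input_price_per_m)


ranking = rank_by_input_price(MODELS)
for position, model in enumerate(ranking, start=1):
    print(f"{position}. {model.slug}  ${model.input_price_per_m:.3f} /M")
```

Sorting on a different field (for example `intelligence` with `reverse=True`) would give a different order, which is why the cheapest model is not necessarily the best fit.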

Frequently asked

What is the cheapest NVIDIA model?

The cheapest NVIDIA model is Nemotron 3 Nano 30B A3B at $0.050 per million input tokens. Nemotron 3 Super ($0.090) and Llama 3.3 Nemotron Super 49B V1.5 ($0.400) round out the top three.

What's a good alternative to Nemotron 3 Nano 30B A3B?

Nemotron 3 Super ($0.090) is the closest alternative on this metric, followed by Llama 3.3 Nemotron Super 49B V1.5 ($0.400). See the full ranking above for the tradeoffs.
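
To make the price gap between these alternatives concrete, here is a minimal sketch that estimates input cost for a given token volume. It assumes cost scales linearly with input tokens at the listed per-million price; output-token pricing is not shown on this page and is left out. The `input_cost_usd` helper and the 20-million-token workload are hypothetical.

```python
# Input prices in USD per million tokens, taken from this page.
PRICES_PER_M = {
    "nemotron-3-nano-30b-a3b": 0.050,
    "nemotron-3-super-120b-a12b": 0.090,
    "llama-3.3-nemotron-super-49b-v1.5": 0.400,
}


def input_cost_usd(slug: str, input_tokens: int) -> float:
    """Estimate input-only cost: price per million tokens times millions of tokens."""
    return PRICES_PER_M[slug] * input_tokens / 1_000_000


workload_tokens = 20_000_000  # hypothetical monthly input volume

for slug in PRICES_PER_M:
    print(f"{slug}: ${input_cost_usd(slug, workload_tokens):.2f}")
```

At this assumed volume the three models come to roughly $1.00, $1.80 and $8.00 of input cost respectively.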

How many NVIDIA models are there?

modelgrep tracks 11 NVIDIA models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Nemotron 3 Ultra (free). Four of them qualify for this ranking.
