modelgrep

Best NVIDIA Vision Models

Quick answer · Updated June 2026

Nemotron 3 Nano Omni (free) is the best vision-capable NVIDIA model, pairing an intelligence score of 21.4 with image and document understanding. Nemotron Nano 12B 2 VL (free) (14.9) and Nemotron 3.5 Content Safety (free) (no score listed) round out the top three.

Intelligence: 21.4
Speed: 181 t/s
Input /M: Free
Context: 256K

Multimodal large language models that accept image input, ranked by intelligence. The best vision-capable AI models for understanding images, documents and charts.

  1. nemotron-3-nano-omni-30b-a3b-reasoning:free
    Reasoning · Tools · Vision · +1
    21.4 intel · Free/M · 181 t/s
  2. nemotron-nano-12b-v2-vl:free
    Reasoning · Tools · Vision
    14.9 intel · Free/M · 27 t/s
  3. nemotron-3.5-content-safety:free
    Reasoning · Vision
    No intelligence score listed · Free/M · 80 t/s · 246ms ttft
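
The ordering above (vision-capable models only, highest intelligence first, unscored models last) can be sketched in a few lines of Python. This is an illustrative sketch, not modelgrep's actual implementation: the `Model` class, the `rank_vision_models` helper, and the text-only entry with its score are hypothetical, and placing unscored models last is an assumption inferred from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    slug: str
    vision: bool
    intelligence: Optional[float]  # None when no score is listed


def rank_vision_models(models):
    """Keep vision-capable models; sort by intelligence, highest first, unscored last."""
    eligible = [m for m in models if m.vision]
    return sorted(
        eligible,
        # False sorts before True, so scored models come first;
        # negating the score gives descending order among them.
        key=lambda m: (m.intelligence is None, -(m.intelligence or 0.0)),
    )


MODELS = [
    Model("nemotron-3.5-content-safety:free", True, None),
    Model("nemotron-nano-12b-v2-vl:free", True, 14.9),
    Model("nemotron-3-nano-omni-30b-a3b-reasoning:free", True, 21.4),
    # Hypothetical text-only entry with a made-up score, included only
    # to show that models without image input are filtered out.
    Model("hypothetical-text-only-model", False, 30.0),
]

ranking = rank_vision_models(MODELS)
for position, model in enumerate(ranking, start=1):
    score = "no score" if model.intelligence is None else str(model.intelligence)
    print(f"{position}. {model.slug} ({score})")
```

With the three scores listed on this page, the sketch reproduces the order shown above; the hypothetical text-only model is excluded despite its higher score because it does not accept image input.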

Frequently asked

What is the best NVIDIA model for vision?

Nemotron 3 Nano Omni (free) is the best vision-capable NVIDIA model, pairing an intelligence score of 21.4 with image and document understanding. Nemotron Nano 12B 2 VL (free) (14.9) and Nemotron 3.5 Content Safety (free) (no score listed) round out the top three.

What's a good alternative to Nemotron 3 Nano Omni (free)?

Nemotron Nano 12B 2 VL (free) (14.9) is the closest alternative by intelligence score, followed by Nemotron 3.5 Content Safety (free) (no score listed). See the full ranking above for the tradeoffs in speed and capabilities.

How many NVIDIA models are there?

modelgrep tracks 11 NVIDIA models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Nemotron 3 Ultra (free). Three of them accept image input and qualify for this ranking.
