Nemotron 3 Nano Omni (free) is the top-ranked vision-capable NVIDIA model, pairing an intelligence score of 21.4 with image and document understanding. Nemotron Nano 12B 2 VL (free) (14.9) and Nemotron 3.5 Content Safety (free) (no score listed) round out the top three.
This ranking covers multimodal large language models that accept image input, ordered by intelligence: the best vision-capable AI models for understanding images, documents and charts.
Nemotron Nano 12B 2 VL (free) (14.9) is the closest alternative on intelligence, followed by Nemotron 3.5 Content Safety (free), which has no score listed. See the full ranking above for the tradeoffs.
modelgrep tracks 11 NVIDIA models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Nemotron 3 Ultra (free). Three of them accept image input and so qualify for this ranking.
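As a rough illustration of how an ordering like this could be produced, the sketch below sorts the three qualifying models by intelligence score and places a model with no score last. The `Model` type and `rank` function are hypothetical names, and treating a missing score as "sort last" is an assumption that happens to match the order shown here; it is not a description of how modelgrep actually computes its ranking.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    intelligence: Optional[float]  # None where the page lists no score ("—")


# The three vision-capable models and scores quoted on this page.
models = [
    Model("Nemotron 3.5 Content Safety (free)", None),
    Model("Nemotron Nano 12B 2 VL (free)", 14.9),
    Model("Nemotron 3 Nano Omni (free)", 21.4),
]


def rank(items: list[Model]) -> list[Model]:
    """Sort scored models from highest to lowest, then unscored models (assumed)."""
    return sorted(
        items,
        key=lambda m: (m.intelligence is None, -(m.intelligence or 0.0)),
    )


ranked = rank(models)
for position, model in enumerate(ranked, start=1):
    score = "—" if model.intelligence is None else str(model.intelligence)
    print(f"{position}. {model.name} ({score})")
```

The sort key is a tuple: the first element pushes unscored models to the end, and the second orders the rest by descending score.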