Gemini 3 Flash Preview is the top-ranked vision-capable Google model, combining an intelligence score of 46.4 with image and document understanding. Gemini 3.5 Flash (43.3) and Gemini 3.1 Pro Preview (41.3) round out the top three.
Multimodal large language models that accept image input, ranked by intelligence. The best vision-capable AI models for understanding images, documents and charts.
Gemini 3.5 Flash (43.3) is the closest alternative to Gemini 3 Flash Preview on intelligence, followed by Gemini 3.1 Pro Preview (41.3). See the full ranking above for the tradeoffs.
modelgrep tracks 26 Google models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Gemini 3 Flash Preview. Of these, 24 qualify for this ranking.