Qwen3.7 Plus is the best vision-capable Qwen model, scoring 53.3 on intelligence while supporting image and document understanding. Qwen3.6 Plus (50.0) and Qwen3.5-122B-A10B (41.6) round out the top three.
This page ranks multimodal large language models that accept image input by intelligence, covering the best vision-capable AI models for understanding images, documents and charts.
Qwen3.6 Plus (50.0) is the closest alternative to Qwen3.7 Plus on intelligence, followed by Qwen3.5-122B-A10B (41.6). See the full ranking above for the tradeoffs.
modelgrep tracks 49 Qwen models with live benchmarks, speed, latency and per-provider pricing; Qwen3.7 Max leads them on intelligence. Of the 49, 21 qualify for this ranking.