GLM 4.6V is the top-ranked vision-capable Z.ai model, pairing an intelligence score of 23.4 with image and document understanding. GLM 4.5V (15.1) is next.
Multimodal large language models that accept image input, ranked by intelligence: the best vision-capable AI models for understanding images, documents and charts.
GLM 4.5V (15.1) is the closest alternative on intelligence, 8.3 points behind GLM 4.6V. See the full ranking above for the tradeoffs.
modelgrep tracks 10 Z.ai models with live benchmarks, speed, latency and per-provider pricing; GLM 5 Turbo leads them on intelligence. Two of the 10 qualify for this ranking.