modelgrep

Best Z.ai Vision Models

Quick answer · Updated June 2026

GLM 4.6V is the best vision-capable Z.ai model, pairing 23.4 intelligence with image and document understanding. GLM 4.5V (15.1) is next.

Intelligence: 23.4
Speed: 60 t/s
Input: $0.300 /M
Context: 131K

Multimodal large language models that accept image input, ranked by intelligence. These are the best vision-capable AI models for understanding images, documents and charts.

  1. glm-4.6v
    Reasoning · Tools · JSON · +1
    23.4 intel · $0.300/M · 60 t/s
  2. glm-4.5v
    Reasoning · Tools · JSON · +1
    15.1 intel · $0.600/M · 19 t/s
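
For readers who want to reproduce this kind of ordering on their own data, here is a minimal sketch of "keep the models that accept image input, then sort by intelligence". The figures are copied from the ranking above; the `Model` class, its `vision` field and the `rank_vision` helper are hypothetical names invented for illustration and are not part of any modelgrep API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    intelligence: float   # intelligence score as listed in the ranking
    usd_per_m: float      # listed price in USD per million tokens
    tokens_per_s: float   # listed speed
    vision: bool          # assumption: flag for "accepts image input"


# Figures copied from the ranking above, deliberately out of order.
MODELS = [
    Model("glm-4.5v", 15.1, 0.600, 19, True),
    Model("glm-4.6v", 23.4, 0.300, 60, True),
]


def rank_vision(models):
    """Keep image-capable models and sort by intelligence, highest first."""
    return sorted(
        (m for m in models if m.vision),
        key=lambda m: m.intelligence,
        reverse=True,
    )


ranking = rank_vision(MODELS)
for place, m in enumerate(ranking, start=1):
    print(f"{place}. {m.name}: {m.intelligence} intel, "
          f"${m.usd_per_m:.3f}/M, {m.tokens_per_s:g} t/s")
```

Sorting on a single key keeps the ordering easy to audit. A ranking that weighs price or speed as well would need an explicit scoring rule, which this page does not describe.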

Frequently asked

What is the best Z.ai model for vision?

GLM 4.6V is the best vision-capable Z.ai model, pairing 23.4 intelligence with image and document understanding. GLM 4.5V (15.1) is next.

What's a good alternative to GLM 4.6V?

GLM 4.5V is the closest alternative on intelligence (15.1 versus 23.4). In the ranking above it is also listed at a higher price ($0.600/M versus $0.300/M) and a lower speed (19 t/s versus 60 t/s).

How many Z.ai models are there?

modelgrep tracks 10 Z.ai models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by GLM 5 Turbo. 2 of them qualify for this ranking.
