Fastest Z.ai Models

Quick answer · Updated June 2026

The fastest Z.ai model is GLM 4.7 at 452 output tokens per second. GLM 5 (108 t/s) and GLM 5.1 (84 t/s) round out the top three.

Speed: 452 t/s

Intelligence: 42.1

Input: $0.400 /M

Context: 203K

AI models ranked by output speed, measured as median (p50) output tokens per second. The fastest large language models suit low-latency and high-throughput applications.
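As a rough illustration of what a p50 tokens-per-second ranking involves, here is a minimal sketch. It assumes each measurement is a pair of output token count and generation time in seconds; the function names and the sample measurements are hypothetical and are not modelgrep's actual method or data. Only the three per-model speeds in the final ranking come from this page.

```python
from statistics import median


def p50_output_speed(samples):
    """Return the median (p50) output speed in tokens per second.

    `samples` is a list of (output_tokens, generation_seconds) pairs.
    Samples with a non-positive duration are skipped.
    """
    speeds = [tokens / seconds for tokens, seconds in samples if seconds > 0]
    if not speeds:
        raise ValueError("no valid samples")
    return median(speeds)


def rank_by_speed(p50_by_model):
    """Sort models from fastest to slowest by their p50 output speed."""
    return sorted(p50_by_model.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements for one model: (output tokens, seconds).
samples = [(900, 2.0), (1000, 2.0), (880, 2.0)]
example_p50 = p50_output_speed(samples)

# Speeds quoted on this page, in tokens per second.
ranking = rank_by_speed({"GLM 5.1": 84, "GLM 4.7": 452, "GLM 5": 108})
for name, speed in ranking:
    print(f"{name}: {speed} t/s")
```

The median is used rather than the mean so that a few unusually slow or fast requests do not dominate the reported figure.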

Frequently asked

What is the fastest Z.ai model?

The fastest Z.ai model is GLM 4.7 at 452 output tokens per second. GLM 5 (108 t/s) and GLM 5.1 (84 t/s) round out the top three.

What's a good alternative to GLM 4.7?

GLM 5 (108 t/s) is the closest alternative on this metric, followed by GLM 5.1 (84 t/s). See the full ranking above for the tradeoffs.

How many Z.ai models are there?

modelgrep tracks 10 Z.ai models with live benchmarks, speed, latency, and per-provider pricing. GLM 5 Turbo leads on intelligence. All 10 qualify for this ranking.
