modelgrep

Z.ai: GLM 4.6

z-ai/glm-4.6

78th smartest of 180 · Reasoning · Tools · JSON
Use via OpenRouter ↗
Intelligence: 30.2 (78th of 180)
Design Elo: 1219 (Game Dev)
Speed: 39 tokens/s (207th fastest)
Latency: 585ms to first token
Input price: $0.430 per million tokens (155th cheapest)
Context: 203K tokens (131K max output)

How it compares

Smarter than 57% of all ranked models
Faster than 33% of all ranked models
Cheaper than 48% of all ranked models

Overview

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 54th percentile
Intelligence Index: 30.2
Coding Index: 30.2
Agentic Index: 42.9
GPQA Diamond: 63%
Humanity's Last Exam: 5%
SciCode: 33%
Tau²-Bench (agentic): 77%
Design Arena · Elo · 14,641 tournaments
Game Dev: 1219
Website: 1219
godotgamedev: 1218
codecategories: 1218
UI Component: 1215
3D: 1209
Data Viz: 1209
mobileapps: 1195
svg: 1166
fullstack: 1109
androidnative: 1103

Providers & pricing (5)

Provider     Quantization  In $/M   Out $/M  Uptime
DeepInfra    fp4           $0.430   $1.74    99.5%
Novita       bf16          $0.550   $2.20    100%
Z.AI         fp4           $0.600   $2.20    87.7%
AtlasCloud   fp8           $0.600   $2.20    100%
Venice       fp4           $0.850   $2.75

Specifications

Context window: 203K
Max output: 131K
Knowledge cutoff: Mar 2025
Input modalities: text
Output modalities: text
Prompt caching
Cache read price: $0.080/M
Moderated: No

GLM 4.6 FAQ

How much does GLM 4.6 cost?

GLM 4.6 costs $0.430 per million input tokens and $1.74 per million output tokens via OpenRouter, making it the 155th cheapest of 298 paid models.
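
To make the per-million pricing concrete, here is a minimal sketch of a per-request cost estimate. The prices come from the answer above; the function name estimate_cost and the example token counts are hypothetical, and the sketch assumes no prompt-cache discount and no provider-specific fees.

```python
from decimal import Decimal

# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.430")
OUTPUT_PRICE_PER_M = Decimal("1.74")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes every input token is billed at the full input price,
    i.e. no cached-read discount is applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_PRICE_PER_M * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_PRICE_PER_M * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: 10,000 prompt tokens and 2,000 reply tokens.
    print(f"${estimate_cost(10_000, 2_000):.5f}")
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift when summed over many calls.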

How smart is GLM 4.6?

GLM 4.6 scores 30.2 on the Artificial Analysis Intelligence Index, ranking 78th of 180 benchmarked models, with a GPQA Diamond score of 63%.

How fast is GLM 4.6?

GLM 4.6 generates around 39 tokens per second with 585ms time-to-first-token (p50), making it the 207th fastest tracked model.
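
As a rough illustration of what these two figures mean together, the sketch below estimates wall-clock time for a streamed reply as time-to-first-token plus output length divided by throughput. The function name estimate_response_seconds is hypothetical, and the model is a simplification: it treats throughput as constant and ignores network and queueing variance.

```python
# Figures quoted above; both are approximate median values.
TTFT_SECONDS = 0.585      # time to first token (p50)
TOKENS_PER_SECOND = 39.0  # approximate generation speed


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total seconds to receive a reply (hypothetical helper).

    Simplified model: first-token latency plus steady-rate generation.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Illustrative reply lengths, in tokens.
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} tokens: about {estimate_response_seconds(n):.1f} s")
```

Under this model, generation time dominates for anything beyond a short reply, so the throughput figure matters more than the first-token latency for long outputs.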

What is GLM 4.6's context window?

GLM 4.6 supports a 203K-token context window and can output up to 131K tokens. It accepts text input.
