modelgrep

Z.ai: GLM 4.7 Flash

z-ai/glm-4.7-flash

81st smartest of 178 · Cheaper than 92% of paid · Reasoning · Tools · JSON
Intelligence: 30.1 (81st of 178)
Design Elo: 1265 (UI Component)
Speed: 37 tokens per second (210th fastest)
Latency: 333ms to first token
Input price: $0.060 per million tokens (24th cheapest)
Context: 203K tokens (16K max output)

How it compares

Smarter than 54% of all ranked models
Faster than 29% of all ranked models
Cheaper than 92% of all ranked models

Overview

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

Benchmarks

independent · via OpenRouter
Artificial Analysis · 53rd percentile
Intelligence Index: 30.1
Coding Index: 25.9
Agentic Index: 46.0
GPQA Diamond: 58%
Humanity's Last Exam: 7%
SciCode: 34%
Tau²-Bench (agentic): 99%
Design Arena · Elo · 5,798 tournaments
UI Component: 1265
Website: 1239
Code categories: 1229
Game Dev: 1204
3D: 1201
Data Viz: 1168
SVG: 1095

Providers & pricing (5)

Provider    Quantization  In $/M  Out $/M  Uptime
DeepInfra   bf16          $0.060  $0.400   95.9%
Cloudflare  -             $0.060  $0.400   99.7%
Novita      bf16          $0.070  $0.400   71.2%
Phala       -             $0.100  $0.430   78.6%
Venice      fp8           $0.125  $0.500   -
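
One way to read the provider list is to pick the cheapest provider that clears an uptime threshold. The sketch below is illustrative only: the `Provider` type, the `cheapest_reliable` helper, and the 99% threshold are hypothetical, and the figures are copied from the listing as shown, so they may be out of date. Providers with no uptime value are treated as ineligible, which is an assumption rather than anything the listing states.

```python
from typing import NamedTuple, Optional, Sequence


class Provider(NamedTuple):
    name: str
    input_price: float       # USD per million input tokens
    output_price: float      # USD per million output tokens
    uptime: Optional[float]  # percent; None when the listing shows no value


# Figures copied from the provider list above.
PROVIDERS = [
    Provider("DeepInfra", 0.060, 0.400, 95.9),
    Provider("Cloudflare", 0.060, 0.400, 99.7),
    Provider("Novita", 0.070, 0.400, 71.2),
    Provider("Phala", 0.100, 0.430, 78.6),
    Provider("Venice", 0.125, 0.500, None),
]


def cheapest_reliable(
    providers: Sequence[Provider], min_uptime: float
) -> Optional[Provider]:
    """Return the cheapest provider whose uptime is at least min_uptime.

    Providers without an uptime figure are skipped (an assumption).
    Ties are broken by input price, then output price, then list order.
    """
    eligible = [
        p for p in providers if p.uptime is not None and p.uptime >= min_uptime
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: (p.input_price, p.output_price))


best = cheapest_reliable(PROVIDERS, min_uptime=99.0)
print(best.name if best else "no provider meets the threshold")
```

With the listed figures, only Cloudflare clears a 99% threshold, so it is the provider selected here.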

Specifications

Context window: 203K
Max output: 16K
Knowledge cutoff: (value not shown)
Input modalities: text
Output modalities: text
Prompt caching: (value not shown)
Cache read price: $0.010/M
Moderated: No

GLM 4.7 Flash FAQ

How much does GLM 4.7 Flash cost?

GLM 4.7 Flash costs $0.060 per million input tokens and $0.400 per million output tokens via OpenRouter, making it the 24th cheapest of 298 paid models.
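
The per-million prices can be turned into a per-request estimate with simple arithmetic. This is a minimal sketch: the `estimate_cost_usd` helper and the example token counts are hypothetical, and it ignores cached-input discounts and any provider-specific pricing differences.

```python
# Prices in USD per million tokens, taken from the answer above.
INPUT_PRICE_PER_M = 0.060
OUTPUT_PRICE_PER_M = 0.400


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

For that hypothetical request the estimate comes to about $0.0014, most of it from output tokens, since output is priced at roughly 6.7 times the input rate.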

How smart is GLM 4.7 Flash?

GLM 4.7 Flash scores 30.1 on the Artificial Analysis Intelligence Index, ranking 81st of 178 benchmarked models, with a GPQA Diamond score of 58%.

How fast is GLM 4.7 Flash?

GLM 4.7 Flash generates around 37 tokens per second with a 333ms time-to-first-token (p50), making it the 210th fastest tracked model.
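
Those two figures give a rough estimate of how long a full response takes: the time to first token plus the output length divided by the generation speed. The sketch below assumes a constant generation rate and uses a hypothetical helper name, `estimate_response_seconds`. Real latency varies by provider and load.

```python
TTFT_SECONDS = 0.333      # p50 time-to-first-token from the answer above
TOKENS_PER_SECOND = 37.0  # approximate generation speed from the answer above


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate, assuming a constant generation rate."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Hypothetical 500-token response.
print(f"{estimate_response_seconds(500):.1f} s")
```

At these rates a 500-token answer takes on the order of 14 seconds, so generation speed, not first-token latency, dominates for anything beyond a few dozen tokens.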

What is GLM 4.7 Flash's context window?

GLM 4.7 Flash supports a 203K-token context window and can output up to 16K tokens. It accepts text input.
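
A request can be checked against these limits before it is sent. This sketch makes two assumptions the listing does not confirm: that "203K" and "16K" mean 203,000 and 16,000 tokens exactly, and that prompt and output tokens share the same context window. The `fits_budget` helper is hypothetical.

```python
CONTEXT_WINDOW = 203_000  # "203K" from the listing; exact count is an assumption
MAX_OUTPUT = 16_000       # "16K" from the listing; exact count is an assumption


def fits_budget(prompt_tokens: int, requested_output: int) -> bool:
    """Return True if the request stays within both limits.

    Assumes prompt and output tokens count against one shared window.
    """
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW


print(fits_budget(180_000, 16_000))  # 196,000 total: within the window
print(fits_budget(190_000, 16_000))  # 206,000 total: exceeds the window
```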
