
Z.ai: GLM 4.6V

z-ai/glm-4.6v

105th smartest of 178
Capabilities: Reasoning, Tools, JSON, Vision
Intelligence: 23.4 (105th of 178)
Design Elo
Speed: 65 tokens per second (123rd fastest)
Latency: 1.3s to first token
Input price: $0.300 per million tokens (132nd cheapest)
Context: 131K tokens (33K max output)

How it compares

Smarter than 41% of all ranked models
Faster than 58% of all ranked models
Cheaper than 56% of all ranked models

Overview

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...

Benchmarks

Independent benchmarks, via OpenRouter
Artificial Analysis: 39th percentile

Intelligence Index: 23.4
Coding Index: 19.7
Agentic Index: 17.5
GPQA Diamond: 72%
Humanity's Last Exam: 9%
SciCode: 30%
Tau²-Bench (agentic): 32%

Providers & pricing (2)

Provider            In $/M    Out $/M   Uptime
Novita (bf16)       $0.300    $0.900    99.4%
Z.AI (fp8, cache)   $0.300    $0.900

Specifications

Context window: 131K
Max output: 33K
Knowledge cutoff
Input modalities: image, text, video
Output modalities: text
Prompt caching: Supported
Cache read price: $0.055/M
Moderated: No

GLM 4.6V FAQ

How much does GLM 4.6V cost?

GLM 4.6V costs $0.300 per million input tokens and $0.900 per million output tokens via OpenRouter, making it the 132nd cheapest of 298 paid models.
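
To make the per-million pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the prices listed on this page. The function name `estimate_cost` and its parameters are hypothetical. It also assumes that cached input tokens are billed at the $0.055/M cache read price instead of the normal input price, with no separate cache write charge; actual provider billing may differ.

```python
# Prices in USD per million tokens, as listed on this page.
INPUT_PER_M = 0.300
OUTPUT_PER_M = 0.900
CACHE_READ_PER_M = 0.055  # assumption: replaces the input price for cached tokens


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the part of input_tokens assumed to be served
    from the prompt cache and billed at the cache read price.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# A 100K-token prompt with a 2K-token reply, without and with caching.
print(round(estimate_cost(100_000, 2_000), 4))
print(round(estimate_cost(100_000, 2_000, cached_tokens=80_000), 4))
```

Under these assumptions, output tokens cost three times as much as input tokens, so long replies affect the bill more than long prompts of the same size.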

How smart is GLM 4.6V?

GLM 4.6V scores 23.4 on the Artificial Analysis Intelligence Index, ranking 105th of 178 benchmarked models, with a GPQA Diamond score of 72%.
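
The "smarter than" and "cheaper than" percentages on this page appear to be consistent with the share of models that rank below this one. The sketch below shows that arithmetic in Python. The formula and the name `share_outranked` are assumptions, not the site's documented method, and the separately reported Artificial Analysis percentile may be computed differently.

```python
def share_outranked(rank: int, total: int) -> float:
    """Percentage of ranked models placed below the given rank.

    Assumed formula: models ranked worse than `rank`, divided by `total`.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100 * (total - rank) / total


# Intelligence: 105th of 178 benchmarked models.
print(round(share_outranked(105, 178)))  # 41
# Price: 132nd cheapest of 298 paid models.
print(round(share_outranked(132, 298)))  # 56
```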

How fast is GLM 4.6V?

GLM 4.6V generates around 65 tokens per second with 1.3s time-to-first-token (p50), the 123rd fastest tracked model.

What is GLM 4.6V's context window?

GLM 4.6V supports a 131K-token context window and can output up to 33K tokens. It accepts image, text, and video input.
