Z.ai: GLM 4.5 Air

z-ai/glm-4.5-air

107th smartest of 178Cheaper than 77% of paidReasoningToolsJSON

Use via OpenRouter ↗

Intelligence

23.2

107th of 178

Design Elo

1237

Data Viz

Speed

176th fastest

Latency

760ms

first token

Input price

$0.125

68th cheapest

Context

131K

131K max out

How it compares

Smarter than40%

of all ranked models

Faster than40%

of all ranked models

Cheaper than77%

of all ranked models

Overview

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...

Benchmarks

independent · via OpenRouter

Artificial Analysis39th percentile

Intelligence Index

23.2

Coding Index

23.8

Agentic Index

21.0

GPQA Diamond

73%

Humanity's Last Exam

SciCode

31%

Tau²-Bench (agentic)

47%

Design Arena · Elo8,103 tournaments

Data Viz

1237

1204

Website

1191

codecategories

1189

UI Component

1181

Game Dev

1163

svg

1128

Providers & pricing (4)

Provider	In $/M	Out $/M	Context	Uptime
Io Netfp8	$0.125	$0.850	131K	100%
Novitabf16	$0.130	$0.850	131K	100%
SiliconFlowfp8	$0.140	$0.860	131K	100%
Z.AIfp8	$0.200	$1.10	131K	98.9%

Specifications

Context window131K

Max output131K

Knowledge cutoffDec 2024

Input modalitiestext

Output modalitiestext

Prompt caching—

Cache read price$0.060/M

ModeratedNo

Open weightszai-org/GLM-4.5-Air ↗

GLM 4.5 Air FAQ

How much does GLM 4.5 Air cost?

GLM 4.5 Air costs $0.125 per million input tokens and $0.850 per million output tokens via OpenRouter, making it 68th cheapest of 298 paid models.

How smart is GLM 4.5 Air?

GLM 4.5 Air scores 23.2 on the Artificial Analysis Intelligence Index, ranking 107th of 178 benchmarked models, with a GPQA Diamond score of 73%.

How fast is GLM 4.5 Air?

GLM 4.5 Air generates around 48 tokens per second with 760ms time-to-first-token (p50), the 176th fastest tracked model.

What is GLM 4.5 Air's context window?

GLM 4.5 Air supports a 131K-token context window and can output up to 131K tokens. It accepts text input.

Similar models

All GLM 4.5 Air alternatives →

mistralai/mistral-large-2512

Intel 22.8$0.500/M

qwen/qwen3-30b-a3b-thinking-2507

Intel 22.4$0.080/M

deepseek/deepseek-chat-v3-0324

Intel 22.3$0.200/M

Compare head-to-head

Z.ai: GLM 4.5 Air vs OpenAI: GPT-5.4 Mini Z.ai: GLM 4.5 Air vs Z.ai: GLM 4.6V Z.ai: GLM 4.5 Air vs OpenAI: GPT-4.1 Mini Z.ai: GLM 4.5 Air vs Mistral: Mistral Large 3 2512 Z.ai: GLM 4.5 Air vs Qwen: Qwen3 30B A3B Thinking 2507 Z.ai: GLM 4.5 Air vs DeepSeek: DeepSeek V3 0324