z-ai/glm-4.6
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...
| Provider | Quantization | Input ($/M tokens) | Output ($/M tokens) | Context (tokens) | Uptime |
|---|---|---|---|---|---|
| DeepInfra | fp4 | $0.430 | $1.74 | 203K | 99.5% |
| Novita | bf16 | $0.550 | $2.20 | 205K | 100% |
| Z.AI | fp4 | $0.600 | $2.20 | 203K | 87.7% |
| AtlasCloud | fp8 | $0.600 | $2.20 | 203K | 100% |
| Venice | fp4 | $0.850 | $2.75 | 198K | n/a |
GLM 4.6 costs $0.430 per million input tokens and $1.74 per million output tokens via OpenRouter (the lowest provider rate in the table above), making it the 155th cheapest of 298 paid models.
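To make the per-million-token pricing concrete, the following minimal sketch estimates the cost of a single request at each provider's listed rate. The prices are copied from the table above; the function name `request_cost` and the example token counts are illustrative assumptions, and real bills may differ (for example through caching discounts or provider routing).

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES = {
    "DeepInfra": (0.430, 1.74),
    "Novita": (0.550, 2.20),
    "Z.AI": (0.600, 2.20),
    "AtlasCloud": (0.600, 2.20),
    "Venice": (0.850, 2.75),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    in_price, out_price = PRICES[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Example workload (assumed): 10,000 prompt tokens and 2,000 completion tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 10_000, 2_000):.5f}")
```

Output tokens cost roughly three to four times as much as input tokens at every listed provider, so generation-heavy workloads are more sensitive to the output rate than to the input rate.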
GLM 4.6 scores 30.2 on the Artificial Analysis Intelligence Index, ranking 78th of 180 benchmarked models, with a GPQA Diamond score of 63%.
GLM 4.6 generates around 39 tokens per second with a median (p50) time-to-first-token of 585 ms, making it the 207th fastest tracked model.
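The two speed figures can be combined into a rough end-to-end latency estimate: time-to-first-token plus the output length divided by throughput. The sketch below assumes throughput stays constant at the quoted rate and that the median time-to-first-token applies, which real requests will not match exactly; `estimated_latency` is a hypothetical helper name.

```python
TTFT_SECONDS = 0.585       # median (p50) time-to-first-token quoted above
TOKENS_PER_SECOND = 39.0   # approximate generation speed quoted above


def estimated_latency(output_tokens: int) -> float:
    """Rough seconds until the full response arrives, assuming constant throughput."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n} output tokens: about {estimated_latency(n):.1f} s")
```

For all but very short responses the generation term dominates, so the 585 ms time-to-first-token mainly affects how responsive streaming feels rather than total completion time.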
GLM 4.6 supports a 203K-token context window and can output up to 131K tokens. It accepts text input.