Lowest-Latency xAI Models

Quick answer · Updated June 2026

Grok 4.3 has the lowest latency of any xAI model, responding in about 675ms to first token. Grok 4.20 (707ms) and Grok Build 0.1 (1.1s) round out the top three.

Latency: 675ms
Intelligence: 53.2
Speed: 127 t/s
Input: $1.25 /M
Context: 1M

xAI models ranked by median (p50) time to first token: the most responsive large language models for real-time and interactive use cases.
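As a rough illustration of what a p50 time-to-first-token ranking involves, the sketch below takes the median of several latency samples per model and sorts the models from lowest to highest. The page does not describe how modelgrep collects or aggregates its measurements, so the model names, the sample values, and the helper `rank_by_p50` are all hypothetical.

```python
from statistics import median

# Hypothetical time-to-first-token samples in milliseconds.
# These are illustrative values, not real measurements of any model.
samples_ms = {
    "model-a": [640, 675, 702, 1310, 660],
    "model-b": [690, 707, 725, 698, 2100],
    "model-c": [1050, 1100, 1180, 1090, 1120],
}


def rank_by_p50(samples):
    """Return (model, p50_ms) pairs sorted from lowest to highest median."""
    p50 = {name: median(values) for name, values in samples.items()}
    return sorted(p50.items(), key=lambda item: item[1])


ranking = rank_by_p50(samples_ms)
for position, (name, p50_ms) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {p50_ms} ms")
```

Using the median rather than the mean keeps a single slow request (such as the 1310 ms and 2100 ms samples here) from changing a model's position in the ranking.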

Frequently asked

Which xAI model has the lowest latency?

Grok 4.3 has the lowest latency of any xAI model, responding in about 675ms to first token. Grok 4.20 (707ms) and Grok Build 0.1 (1.1s) round out the top three.

What's a good alternative to Grok 4.3?

Grok 4.20 (707ms) is the closest alternative on this metric, followed by Grok Build 0.1 (1.1s). See the full ranking above for the tradeoffs.

How many xAI models are there?

modelgrep tracks four xAI models with live benchmarks, speed, latency, and per-provider pricing. Grok 4.3 leads them on intelligence, and all four qualify for this ranking.
