Grok 4.3 has the lowest latency of any xAI model, with a time to first token of about 675ms. Grok 4.20 (707ms) and Grok Build 0.1 (1.1s) round out the top three.
AI models ranked by median (p50) time to first token, highlighting the most responsive large language models for real-time and interactive use cases.
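For readers unfamiliar with the metric, a p50 ranking can be sketched as follows: take the median of each model's time-to-first-token samples, then sort ascending. This is a minimal illustration, not modelgrep's actual pipeline; the function name `rank_by_p50_ttft`, the model names, and the sample values are all hypothetical.

```python
from statistics import median


def rank_by_p50_ttft(samples_ms):
    """Return (model, p50_ms) pairs sorted from fastest to slowest.

    samples_ms maps a model name to a list of time-to-first-token
    measurements in milliseconds. Models with no samples are skipped.
    """
    p50 = {
        model: median(values)
        for model, values in samples_ms.items()
        if values
    }
    return sorted(p50.items(), key=lambda item: item[1])


# Hypothetical measurements (ms); these are not real benchmark data.
samples = {
    "model-a": [110, 120, 400],
    "model-b": [140, 150, 160],
    "model-c": [280, 300, 900],
}

ranking = rank_by_p50_ttft(samples)
for position, (model, p50_ms) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {p50_ms} ms")
```

In this made-up data, model-a has one slow outlier (400 ms) that would push its mean above model-b's, yet it still ranks first on p50. That is the reason a median is a common choice for latency rankings: it reflects the typical request rather than the occasional slow one.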
Grok 4.20 (707ms) is the closest alternative to Grok 4.3 on time to first token, followed by Grok Build 0.1 (1.1s). See the full ranking above for the tradeoffs.
modelgrep tracks 4 xAI models with live benchmarks, speed, latency, and per-provider pricing; Grok 4.3 leads on intelligence. All 4 qualify for this ranking.