Lowest-Latency Anthropic Models

Quick answer · Updated June 2026

Claude Haiku 4.5 has the lowest latency of any Anthropic model, with a median time to first token of about 521ms. Claude 3 Haiku (542ms) and Claude Sonnet 4 (699ms) round out the top three.

Claude Haiku 4.5: 521ms latency · 31.0 intelligence · 82 t/s speed · $1.00 input /M · 200K context

Anthropic models ranked by median (p50) time to first token, lowest first. These are the most responsive models for real-time and interactive use cases.
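As a rough illustration of what a p50 time-to-first-token figure means, the sketch below takes the median of a handful of latency samples. The sample values and the function name `p50_ttft_ms` are hypothetical; the page does not describe how modelgrep collects or aggregates its measurements.

```python
from statistics import median

def p50_ttft_ms(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return median(samples_ms)

# Hypothetical measurements for a single model, in milliseconds.
samples = [498, 505, 521, 560, 1340]

print(p50_ttft_ms(samples))          # median of the five samples
print(sum(samples) / len(samples))   # mean, pulled upward by the slow 1340ms request
```

The median is less sensitive to an occasional slow request than the mean, which is one reason a p50 figure is a reasonable summary of typical responsiveness.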

1. claude-haiku-4.5 · 521ms latency · Reasoning, Tools, JSON, +1 · 31.0 intel · $1.00/M · 82 t/s
2. claude-3-haiku · 542ms latency · Tools, Vision · 12.3 intel · $0.250/M · 68 t/s
3. claude-sonnet-4 · 699ms latency · Reasoning, Tools, Vision · 33.0 intel · $3.00/M · 49 t/s
4. claude-opus-4.5 · 777ms latency · Reasoning, Tools, JSON, +1 · 43.1 intel · $5.00/M · 60 t/s
5. claude-3.5-haiku · 832ms latency · Tools, Vision · 18.7 intel · $0.800/M · 35 t/s
6. claude-sonnet-4.5 · 880ms latency · Reasoning, Tools, JSON, +1 · 37.1 intel · $3.00/M · 46 t/s
7. claude-sonnet-4.6 · 1.0s latency · Reasoning, Tools, JSON, +1 · 42.6 intel · $3.00/M · 47 t/s
8. claude-opus-4.6-fast · 1.3s latency · Reasoning, Tools, JSON, +1 · $30.00/M · 11 t/s · 1M ctx
9. claude-opus-4.6 · 1.5s latency · Reasoning, Tools, JSON, +1 · 52.9 intel · $5.00/M · 41 t/s
10. claude-opus-4.8-fast · 1.6s latency · Reasoning, Tools, JSON, +1 · $10.00/M · 121 t/s · 1M ctx
11. claude-opus-4.7 · 1.6s latency · Reasoning, Tools, JSON, +1 · 57.3 intel · $5.00/M · 63 t/s
12. claude-opus-4.8 · 1.8s latency · Reasoning, Tools, JSON, +1 · 61.4 intel · $5.00/M · 59 t/s
13. claude-opus-4.1 · 2.1s latency · Reasoning, Tools, JSON, +1 · $15.00/M · 27 t/s · 200K ctx
14. claude-opus-4 · 2.3s latency · Reasoning, Tools, Vision · $15.00/M · 10 t/s · 200K ctx
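Latency to first token is only part of the wait for a complete reply. The sketch below combines the latency and t/s figures from the ranking into a rough estimate of total response time. It assumes the t/s figure is a steady output rate and that nothing else adds delay; both are simplifying assumptions, and `estimated_response_s` is a hypothetical helper, not something the page defines.

```python
def estimated_response_s(ttft_s, tokens_per_s, output_tokens):
    """Rough total time: wait for the first token, then stream the rest at a fixed rate."""
    return ttft_s + output_tokens / tokens_per_s

# (time to first token in seconds, tokens per second), copied from the ranking.
MODELS = {
    "claude-haiku-4.5": (0.521, 82),
    "claude-3-haiku": (0.542, 68),
    "claude-opus-4.8-fast": (1.6, 121),
}

for output_tokens in (50, 500):
    for name, (ttft_s, tps) in MODELS.items():
        total = estimated_response_s(ttft_s, tps, output_tokens)
        print(f"{output_tokens:>4} tokens · {name}: ~{total:.2f}s")
```

Under these assumptions, claude-haiku-4.5 finishes a 50-token reply first, while claude-opus-4.8-fast overtakes it on a 500-token reply because of its higher throughput. The lowest time to first token matters most for short, interactive replies.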

Frequently asked

Which Anthropic model has the lowest latency?

Claude Haiku 4.5 has the lowest latency of any Anthropic model, with a median time to first token of about 521ms. Claude 3 Haiku (542ms) and Claude Sonnet 4 (699ms) round out the top three.

What's a good alternative to Claude Haiku 4.5?

Claude 3 Haiku (542ms) is the closest alternative on this metric, followed by Claude Sonnet 4 (699ms). See the full ranking above for the tradeoffs.
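One way to weigh the tradeoff between latency and intelligence is to fix a latency budget and pick the highest-scoring model that fits. The sketch below does this with the latency and intel figures from the ranking, using only the rows that list an intel score. The function name `smartest_within` and the budget values are illustrative assumptions.

```python
# (model, latency in ms, intel score), copied from the ranking.
# Rows without an intel score are left out.
RANKING = [
    ("claude-haiku-4.5", 521, 31.0),
    ("claude-3-haiku", 542, 12.3),
    ("claude-sonnet-4", 699, 33.0),
    ("claude-opus-4.5", 777, 43.1),
    ("claude-3.5-haiku", 832, 18.7),
    ("claude-sonnet-4.5", 880, 37.1),
    ("claude-sonnet-4.6", 1000, 42.6),
    ("claude-opus-4.6", 1500, 52.9),
    ("claude-opus-4.7", 1600, 57.3),
    ("claude-opus-4.8", 1800, 61.4),
]

def smartest_within(budget_ms):
    """Return the highest-intel model whose latency fits the budget, or None."""
    candidates = [row for row in RANKING if row[1] <= budget_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[2])[0]

print(smartest_within(600))  # claude-haiku-4.5
print(smartest_within(900))  # claude-opus-4.5
print(smartest_within(500))  # None
```

With a 600ms budget the choice stays with claude-haiku-4.5, but loosening the budget to 900ms admits claude-opus-4.5, which scores higher on intelligence at 777ms.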

How many Anthropic models are there?

modelgrep tracks 16 Anthropic models with live benchmarks, speed, latency, and per-provider pricing. Claude Fable 5 leads on intelligence. Of the 16, 14 qualify for this ranking.
