modelgrep

Lowest-Latency Amazon Models

Quick answer · Updated June 2026

Nova Micro 1.0 has the lowest latency of any Amazon model tracked here, with a median (p50) time to first token of about 322ms. Nova Lite 1.0 (482ms) and Nova 2 Lite (544ms) round out the top three.

Latency: 322ms
Intelligence: 10.3
Speed: 97 t/s
Input price: $0.035 /M
Context: 128K

Amazon models ranked by median (p50) time to first token, from most to least responsive. Low latency matters most for real-time and interactive use cases.

  1. nova-micro-v1
     Capabilities: Tools
     10.3 intel · $0.035/M · 97 t/s
     Latency: 322ms
  2. nova-lite-v1
     Capabilities: Tools, Vision
     12.7 intel · $0.060/M · 76 t/s
     Latency: 482ms
  3. nova-2-lite-v1
     Capabilities: Reasoning, Tools, Vision
     24.6 intel · $0.300/M · 119 t/s
     Latency: 544ms
  4. nova-pro-v1
     Capabilities: Tools, Vision
     13.5 intel · $0.800/M · 41 t/s
     Latency: 965ms
  5. nova-premier-v1
     Capabilities: Tools, Vision
     19.0 intel · $2.50/M · 12 t/s
     Latency: 13.2s
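
The ordering above can be reproduced with a short script. This is a minimal sketch, not modelgrep's own code: the `ModelStats` class and `rank_by_latency` function are hypothetical names, and the figures are copied from the list above, with 13.2s written as 13200 ms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    """Hypothetical record holding the figures shown in the ranking."""
    name: str
    ttft_ms: float            # p50 time to first token, in milliseconds
    intelligence: float       # intelligence score as listed
    input_price_per_m: float  # USD per million input tokens
    tokens_per_s: float       # speed as listed


# Deliberately unsorted so the sort below does real work.
MODELS = [
    ModelStats("nova-premier-v1", 13200, 19.0, 2.50, 12),
    ModelStats("nova-2-lite-v1", 544, 24.6, 0.300, 119),
    ModelStats("nova-micro-v1", 322, 10.3, 0.035, 97),
    ModelStats("nova-pro-v1", 965, 13.5, 0.800, 41),
    ModelStats("nova-lite-v1", 482, 12.7, 0.060, 76),
]


def rank_by_latency(models):
    """Return models ordered from lowest to highest time to first token."""
    return sorted(models, key=lambda m: m.ttft_ms)


if __name__ == "__main__":
    for rank, m in enumerate(rank_by_latency(MODELS), start=1):
        print(f"{rank}. {m.name}: {m.ttft_ms:.0f} ms")
```

Sorting on a different field, such as `intelligence` in descending order, would put nova-2-lite-v1 first instead.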

Frequently asked

Which Amazon model has the lowest latency?

Nova Micro 1.0 has the lowest latency of any Amazon model tracked here, with a median (p50) time to first token of about 322ms. Nova Lite 1.0 (482ms) and Nova 2 Lite (544ms) round out the top three.

What's a good alternative to Nova Micro 1.0?

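Nova Lite 1.0 (482ms) is the closest alternative on this metric, followed by Nova 2 Lite (544ms). Both add vision support. Nova 2 Lite also adds reasoning, scores highest on intelligence (24.6) and streams fastest (119 t/s), but at $0.300/M input it costs more than Nova Micro 1.0 ($0.035/M) or Nova Lite 1.0 ($0.060/M).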
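
Time to first token is only part of the wait for a complete response. The sketch below estimates total response time as first-token latency plus output length divided by speed. It assumes the listed t/s figure is a constant output rate and ignores network overhead and any hidden reasoning tokens. The function names are hypothetical.

```python
def estimate_total_seconds(ttft_ms, tokens_per_s, n_tokens):
    """Rough end-to-end time: wait for the first token, then stream the rest."""
    return ttft_ms / 1000.0 + n_tokens / tokens_per_s


def break_even_tokens(a_ttft_ms, a_tps, b_ttft_ms, b_tps):
    """Output length at which model B (slower start, faster stream)
    finishes at the same time as model A. Returns None if B never catches up."""
    if b_tps <= a_tps or b_ttft_ms <= a_ttft_ms:
        return None
    ttft_gap_s = (b_ttft_ms - a_ttft_ms) / 1000.0
    per_token_saving_s = 1.0 / a_tps - 1.0 / b_tps
    return ttft_gap_s / per_token_saving_s


if __name__ == "__main__":
    # Figures from the ranking: Nova Micro 1.0 vs Nova 2 Lite.
    micro = estimate_total_seconds(322, 97, 500)
    nova2_lite = estimate_total_seconds(544, 119, 500)
    print(f"500 tokens: micro {micro:.2f}s, nova-2-lite {nova2_lite:.2f}s")
    print(f"break-even: {break_even_tokens(322, 97, 544, 119):.0f} tokens")
```

Under these assumptions Nova 2 Lite overtakes Nova Micro 1.0 at roughly 116 output tokens, so the lowest-latency model is not always the first to finish a long response.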

How many Amazon models are there?

modelgrep tracks 5 Amazon models with live benchmarks, speed, latency and per-provider pricing. Nova 2 Lite leads on intelligence. All 5 qualify for this ranking.
