modelgrep

Lowest-Latency MiniMax Models

Quick answer · Updated June 2026

MiniMax M2 has the lowest latency of any MiniMax model, responding in about 340ms to first token. MiniMax M2.7 (465ms) and MiniMax M2.5 (532ms) round out the top three.

Latency: 340ms
Intelligence: 36.1
Speed: 103 t/s
Input: $0.255/M
Context: 205K

MiniMax models ranked by median (p50) time to first token, lowest first. Lower latency matters most for real-time and interactive use cases.

  1. minimax-m2
    Reasoning · Tools · JSON · 36.1 intel · $0.255/M · 103 t/s
    Latency: 340ms
  2. minimax-m2.7
    Reasoning · Tools · JSON · 49.6 intel · $0.250/M · 273 t/s
    Latency: 465ms
  3. minimax-m2.5
    Reasoning · Tools · JSON · 41.9 intel · $0.150/M · 183 t/s
    Latency: 532ms
  4. minimax-m3
    Reasoning · Tools · JSON +1 · 54.7 intel · $0.300/M · 42 t/s
    Latency: 689ms
  5. minimax-m2.1
    Reasoning · Tools · JSON · 39.4 intel · $0.290/M · 149 t/s
    Latency: 769ms
  6. minimax-01
    Vision · $0.200/M · 34 t/s · 1.0M ctx
    Latency: 823ms
  7. minimax-m1
    Reasoning · Tools · $0.400/M · 18 t/s · 1M ctx
    Latency: 840ms
  8. minimax-m2-her
    $0.300/M · 19 t/s · 66K ctx
    Latency: 915ms
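
The p50 figure used for this ranking is a median: half of the measured requests return a first token faster, half slower. Below is a rough sketch of how such a ranking could be derived from raw samples. The model names and sample values are invented for illustration, and this is not necessarily how modelgrep measures latency.

```python
from statistics import median

# Hypothetical time-to-first-token samples in milliseconds.
# Invented for illustration; these are not modelgrep measurements.
samples_ms = {
    "model-a": [310, 340, 335, 900, 345],
    "model-b": [450, 470, 465, 460, 1200],
}


def p50(values):
    """Return the median (50th percentile) of a list of latencies."""
    return median(values)


# Rank models by median TTFT, lowest (most responsive) first.
ranking = sorted(samples_ms, key=lambda name: p50(samples_ms[name]))

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {p50(samples_ms[name])}ms")
```

Using the median rather than the mean keeps a single slow outlier (the 900ms and 1200ms samples here) from distorting the ranking.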

Frequently asked

Which MiniMax model has the lowest latency?

MiniMax M2 has the lowest latency of any MiniMax model, responding in about 340ms to first token. MiniMax M2.7 (465ms) and MiniMax M2.5 (532ms) round out the top three.

What's a good alternative to MiniMax M2?

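MiniMax M2.7 (465ms) is the closest alternative on this metric, followed by MiniMax M2.5 (532ms). M2.7 also scores higher than M2 on intelligence (49.6 vs. 36.1) and throughput (273 t/s vs. 103 t/s) at a similar input price, while M2.5 is the cheapest of the three at $0.150/M. See the full ranking above for the remaining tradeoffs.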
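
One way to weigh these tradeoffs is to set a latency budget and pick the highest-intelligence model that fits within it. Here is a minimal sketch using four rows copied from the ranking on this page. The function name `best_within_budget` and the budget values are hypothetical, not part of modelgrep.

```python
from typing import Optional

# (name, p50 latency in ms, intelligence score) copied from the ranking.
MODELS = [
    ("minimax-m2", 340, 36.1),
    ("minimax-m2.7", 465, 49.6),
    ("minimax-m2.5", 532, 41.9),
    ("minimax-m2.1", 769, 39.4),
]


def best_within_budget(models, budget_ms: int) -> Optional[str]:
    """Return the highest-intelligence model whose latency fits the budget."""
    candidates = [m for m in models if m[1] <= budget_ms]
    if not candidates:
        return None  # nothing is fast enough for this budget
    return max(candidates, key=lambda m: m[2])[0]


print(best_within_budget(MODELS, 400))  # only minimax-m2 fits
print(best_within_budget(MODELS, 500))  # minimax-m2.7 fits and scores higher
```

A tight budget leaves only the fastest model, while a little headroom lets a slower but higher-scoring model win.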

How many MiniMax models are there?

modelgrep tracks 8 MiniMax models with live benchmarks, speed, latency and per-provider pricing. MiniMax M3 leads them on intelligence, and all 8 qualify for this ranking.
