Lowest-Latency MiniMax Models

Quick answer · Updated June 2026

MiniMax M2 has the lowest latency of any MiniMax model, responding in about 340ms to first token. MiniMax M2.7 (465ms) and MiniMax M2.5 (532ms) round out the top three.

Latency: 340ms

Intelligence: 36.1

Speed: 103 t/s

Input: $0.255 /M

Context: 205K

Models are ranked by median (p50) time to first token, lowest first. This highlights the most responsive large language models for real-time and interactive use cases.
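
As a rough illustration of what a p50 time-to-first-token ranking involves, the sketch below takes the median of each model's latency samples and sorts ascending. The function name, the placeholder model names, and the sample values are all hypothetical and chosen for illustration; this is not modelgrep's actual pipeline or data.

```python
from statistics import median


def rank_by_p50_ttft(samples_ms):
    """Return (model, p50_ms) pairs sorted from lowest to highest median TTFT.

    samples_ms maps a model name to a list of time-to-first-token
    measurements in milliseconds. Models with no samples are skipped.
    """
    p50 = {model: median(values) for model, values in samples_ms.items() if values}
    return sorted(p50.items(), key=lambda item: item[1])


# Hypothetical measurements in milliseconds, for illustration only.
example_samples = {
    "model-a": [500, 520, 560, 610, 530],
    "model-b": [330, 350, 340, 900, 335],  # one slow outlier at 900 ms
    "model-c": [450, 470, 465, 480, 460],
}

ranking = rank_by_p50_ttft(example_samples)
for model, p50_ms in ranking:
    print(f"{model}: {p50_ms} ms")
```

In this made-up data, the single 900 ms outlier for `model-b` does not move its median, which is one reason a p50 is often preferred over a mean when summarizing latency.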

Frequently asked

Which MiniMax model has the lowest latency?

MiniMax M2 has the lowest latency of any MiniMax model, responding in about 340ms to first token. MiniMax M2.7 (465ms) and MiniMax M2.5 (532ms) round out the top three.

What's a good alternative to MiniMax M2?

MiniMax M2.7 (465ms) is the closest alternative on this metric, followed by MiniMax M2.5 (532ms). See the full ranking above for the tradeoffs.

How many MiniMax models are there?

modelgrep tracks 8 MiniMax models with live benchmarks, speed, latency, and per-provider pricing. MiniMax M3 leads on intelligence. All 8 qualify for this ranking.
