Lowest-Latency Xiaomi Models

Quick answer · Updated June 2026

MiMo-V2.5-Pro has the lowest latency of any Xiaomi model, responding in about 198ms to first token. MiMo-V2-Flash (536ms) and MiMo-V2.5 (2.4s) round out the top three.

Latency: 198ms

Intelligence: 53.8

Speed: 31 t/s

Input: $0.435 /M

Context: 1.0M

Models are ranked by median (p50) time to first token, so the list favors the most responsive large language models for real-time and interactive use cases.
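As a rough illustration of ranking by p50 time to first token, here is a minimal Python sketch. The helper names `p50` and `rank_by_latency` and the dictionary `XIAOMI_P50_MS` are hypothetical, introduced only for this example. The three latency figures are the ones quoted on this page, with 2.4s converted to 2400ms. How modelgrep actually collects and aggregates its samples is not stated here, so the plain median is an assumption.

```python
from statistics import median


def p50(samples_ms):
    """Median (50th percentile) of time-to-first-token samples, in milliseconds."""
    return median(samples_ms)


def rank_by_latency(p50_by_model):
    """Return (model, p50_ms) pairs sorted from lowest to highest latency."""
    return sorted(p50_by_model.items(), key=lambda item: item[1])


# p50 figures quoted on this page, in milliseconds (2.4s -> 2400ms).
# The dictionary itself is a hypothetical container for this sketch.
XIAOMI_P50_MS = {
    "MiMo-V2.5": 2400,
    "MiMo-V2.5-Pro": 198,
    "MiMo-V2-Flash": 536,
}

for position, (model, latency_ms) in enumerate(rank_by_latency(XIAOMI_P50_MS), start=1):
    print(f"{position}. {model}: {latency_ms}ms")
```

With raw measurements available, `p50` would reduce each model's samples to a single figure before ranking. The median is less sensitive than the mean to an occasional slow response, which is one reason latency is commonly reported as p50.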

Frequently asked

Which Xiaomi model has the lowest latency?

MiMo-V2.5-Pro has the lowest latency of any Xiaomi model, responding in about 198ms to first token. MiMo-V2-Flash (536ms) and MiMo-V2.5 (2.4s) round out the top three.

What's a good alternative to MiMo-V2.5-Pro?

MiMo-V2-Flash (536ms) is the closest alternative on this metric, followed by MiMo-V2.5 (2.4s). See the full ranking above for the tradeoffs.

How many Xiaomi models are there?

modelgrep tracks 3 Xiaomi models with live benchmarks, speed, latency, and per-provider pricing, led on intelligence by MiMo-V2.5-Pro. All 3 qualify for this ranking.
