Fastest Xiaomi Models

Quick answer · Updated June 2026

The fastest Xiaomi model is MiMo-V2-Flash at 83 output tokens per second. MiMo-V2.5 (59 t/s) and MiMo-V2.5-Pro (38 t/s) round out the top three.

Speed: 83 t/s
Intelligence: 30.3
Input price: $0.100/M
Context: 262K

Xiaomi models ranked by output speed (tokens per second, p50), for low-latency and high-throughput applications.

1. mimo-v2-flash: 83 t/s
   Reasoning, Tools, JSON · 30.3 intel · $0.100/M · 600ms ttft
2. mimo-v2.5: 59 t/s
   Reasoning, Tools, JSON, +2 more · 49.0 intel · $0.140/M · 1.7s ttft
3. mimo-v2.5-pro: 38 t/s
   Reasoning, Tools, JSON · 53.8 intel · $0.435/M · 195ms ttft
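
Output speed and time to first token (ttft) pull in different directions in the ranking above, so it can help to combine them. The sketch below is a rough estimate, not how modelgrep computes anything: it assumes total response time is approximately ttft plus output tokens divided by output speed, and it ignores network variance, reasoning tokens and per-provider differences. The figures are copied from the ranking; the names `ModelStats`, `estimated_seconds` and `fastest_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    name: str
    tokens_per_second: float  # p50 output speed, copied from the ranking
    ttft_seconds: float       # time to first token, copied from the ranking


# Figures taken from the ranking above (June 2026 snapshot).
MODELS = [
    ModelStats("mimo-v2-flash", 83.0, 0.600),
    ModelStats("mimo-v2.5", 59.0, 1.7),
    ModelStats("mimo-v2.5-pro", 38.0, 0.195),
]


def estimated_seconds(model: ModelStats, output_tokens: int) -> float:
    """Assumed model: wait for the first token, then stream at a constant rate."""
    return model.ttft_seconds + output_tokens / model.tokens_per_second


def fastest_for(output_tokens: int) -> ModelStats:
    """Return the model with the lowest estimated total time for a response size."""
    return min(MODELS, key=lambda m: estimated_seconds(m, output_tokens))


if __name__ == "__main__":
    for n in (10, 100, 500):
        best = fastest_for(n)
        print(f"{n} tokens: {best.name} (~{estimated_seconds(best, n):.2f}s)")
```

Under this simplified model, mimo-v2.5-pro comes out ahead for very short responses because of its lower ttft, while mimo-v2-flash wins once the response grows past roughly 28 output tokens.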

Frequently asked

What is the fastest Xiaomi model?

The fastest Xiaomi model is MiMo-V2-Flash at 83 output tokens per second. MiMo-V2.5 (59 t/s) and MiMo-V2.5-Pro (38 t/s) round out the top three.

What's a good alternative to MiMo-V2-Flash?

MiMo-V2.5 (59 t/s) is the closest alternative on this metric, followed by MiMo-V2.5-Pro (38 t/s). See the full ranking above for the tradeoffs.

How many Xiaomi models are there?

modelgrep tracks 3 Xiaomi models with live benchmarks, speed, latency and per-provider pricing. MiMo-V2.5-Pro leads on intelligence. All 3 qualify for this ranking.
