Lowest-Latency MoonshotAI Models

Quick answer · Updated June 2026

Kimi K2.5 has the lowest latency of any MoonshotAI model, with a time to first token of about 211ms. Kimi K2 0905 (220ms) and Kimi K2.7 Code (378ms) round out the top three.

Latency: 211ms

Intelligence: 37.3

Speed: 89 t/s

Input: $0.375 /M

Context: 262K

This ranking orders models by median (p50) time to first token, so the most responsive large language models for real-time and interactive use cases come first.
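As a rough illustration of what a p50 time-to-first-token ranking involves, the sketch below takes the median of each model's latency samples and sorts ascending. The function name `rank_by_p50_ttft` and the sample values are hypothetical. The samples are made up so that the medians match the figures quoted on this page; they are not real measurements and this is not how modelgrep necessarily computes its ranking.

```python
from statistics import median


def rank_by_p50_ttft(samples_ms):
    """Return (model, p50) pairs sorted from lowest to highest median TTFT."""
    # Skip models with no samples, since median() raises on empty input.
    p50 = {model: median(values) for model, values in samples_ms.items() if values}
    return sorted(p50.items(), key=lambda item: item[1])


# Hypothetical samples (milliseconds), chosen only so that each median
# equals the latency quoted on this page. Not real measurements.
samples_ms = {
    "Kimi K2.7 Code": [350, 378, 420],
    "Kimi K2.5": [198, 211, 240],
    "Kimi K2 0905": [205, 220, 260],
}

ranking = rank_by_p50_ttft(samples_ms)
for model, latency in ranking:
    print(f"{model}: {latency}ms")
```

The median is less sensitive to occasional slow requests than the mean, which is one plausible reason to rank on p50 rather than on average latency.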

Frequently asked

Which MoonshotAI model has the lowest latency?

Kimi K2.5 has the lowest latency of any MoonshotAI model, with a time to first token of about 211ms. Kimi K2 0905 (220ms) and Kimi K2.7 Code (378ms) round out the top three.

What's a good alternative to Kimi K2.5?

Kimi K2 0905 (220ms) is the closest alternative on this metric, followed by Kimi K2.7 Code (378ms). See the full ranking above for the tradeoffs.

How many MoonshotAI models are there?

modelgrep tracks 6 MoonshotAI models with live benchmarks, speed, latency, and per-provider pricing; Kimi K2.6 leads on intelligence. All 6 qualify for this ranking.
