Lowest-Latency Mistral Models

Quick answer · Updated June 2026

Codestral 2508 has the lowest latency of any Mistral model, with a time to first token of about 136ms. Ministral 3 3B 2512 (191ms) and Ministral 3 8B 2512 (241ms) round out the top three.

Latency: 136ms · Speed: 152 t/s · Input: $0.300 per 1M tokens · Context: 256K

Models are ranked by median (p50) time to first token, which highlights the most responsive large language models for real-time and interactive use cases.
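As a rough illustration of what a p50 ranking involves, the sketch below sorts the three latency figures quoted on this page and derives a median from a list of raw samples. The function names and the raw sample values are hypothetical and were invented for this example. This is not how modelgrep computes its numbers.

```python
from statistics import median


def p50(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    return median(samples_ms)


def rank_by_latency(p50_by_model):
    """Sort (model, p50_ms) pairs from lowest to highest latency."""
    return sorted(p50_by_model.items(), key=lambda item: item[1])


# p50 figures quoted on this page, deliberately listed out of order.
published_p50_ms = {
    "Ministral 3 8B 2512": 241,
    "Codestral 2508": 136,
    "Ministral 3 3B 2512": 191,
}

ranking = rank_by_latency(published_p50_ms)
for position, (model, ms) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {ms}ms")

# Hypothetical raw samples, invented purely to show how a p50 is derived.
# A single slow outlier (410ms) does not move the median.
example_samples_ms = [120, 136, 180, 131, 410]
example_p50 = p50(example_samples_ms)
```

Using the median rather than the mean keeps an occasional slow request from distorting the ranking, which is presumably why the p50 is the headline figure here.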

Frequently asked

Which Mistral model has the lowest latency?

Codestral 2508 has the lowest latency of any Mistral model, with a time to first token of about 136ms. Ministral 3 3B 2512 (191ms) and Ministral 3 8B 2512 (241ms) round out the top three.

What's a good alternative to Codestral 2508?

Ministral 3 3B 2512 (191ms) is the closest alternative on this metric, followed by Ministral 3 8B 2512 (241ms). See the full ranking above for the tradeoffs.

How many Mistral models are there?

modelgrep tracks 19 Mistral models with live benchmarks, speed, latency, and per-provider pricing; Mistral Medium 3.5 leads on intelligence. All 19 qualify for this ranking.
