Fastest Mistral Models

Quick answer · Updated June 2026

The fastest Mistral model is Mistral Small 4 at 110 output tokens per second. Mixtral 8x22B Instruct (102 t/s) and Mistral Small 3.2 24B (92 t/s) round out the top three.

Speed: 110 t/s · Intelligence: 18.6 · Input: $0.150 /M · Context: 262K

Mistral models ranked by output speed (tokens per second, p50), for low-latency and high-throughput applications.
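As a rough illustration of the metric, p50 is the median: half of the measured requests stream faster than this figure and half slower. The sketch below shows one way such a number could be derived from raw measurements. The sample data and the function names `tokens_per_second` and `p50_speed` are hypothetical and are not taken from the site's methodology.

```python
from statistics import median


def tokens_per_second(output_tokens: int, generation_seconds: float) -> float:
    """Output speed of a single request: tokens generated per second."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return output_tokens / generation_seconds


def p50_speed(samples: list[tuple[int, float]]) -> float:
    """Median (p50) output speed over (output_tokens, generation_seconds) samples."""
    return median(tokens_per_second(tokens, secs) for tokens, secs in samples)


# Hypothetical measurements for one model: (output tokens, seconds spent generating).
samples = [(220, 2.0), (330, 3.0), (100, 1.0), (540, 4.5), (90, 1.0)]
print(f"{p50_speed(samples):.0f} t/s")
```

A median is less sensitive than a mean to the occasional very slow or very fast request, which is one plausible reason to rank on p50.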

1. mistral-small-2603 · 110 t/s · Reasoning, Tools, JSON, +1 · 18.6 intel · $0.150/M · 385ms ttft
2. mixtral-8x22b-instruct · 102 t/s · Tools, JSON · $2.00/M · 346ms ttft · 66K ctx
3. mistral-small-3.2-24b-instruct · 92 t/s · Tools, JSON, Vision · $0.075/M · 314ms ttft · 128K ctx
4. mistral-nemo · 75 t/s · Tools, JSON · $0.020/M · 276ms ttft · 131K ctx
5. codestral-2508 · 67 t/s · Tools, JSON · $0.300/M · 172ms ttft · 256K ctx
6. voxtral-small-24b-2507 · 66 t/s · Tools, JSON, Audio · $0.100/M · 268ms ttft · 32K ctx
7. mistral-saba · 58 t/s · Tools, JSON · $0.200/M · 295ms ttft · 33K ctx
8. ministral-3b-2512 · 55 t/s · Tools, JSON, Vision · 11.2 intel · $0.100/M · 237ms ttft
9. ministral-14b-2512 · 50 t/s · Tools, JSON, Vision · 16.0 intel · $0.200/M · 249ms ttft
10. mistral-medium-3 · 38 t/s · Tools, JSON, Vision · 18.8 intel · $0.400/M · 397ms ttft
11. mistral-large-2512 · 37 t/s · Tools, JSON, Vision · 22.8 intel · $0.500/M · 806ms ttft
12. mistral-large · 33 t/s · Tools, JSON · $2.00/M · 683ms ttft · 128K ctx
13. mistral-small-24b-instruct-2501 · 31 t/s · JSON · $0.050/M · 479ms ttft · 33K ctx
14. mistral-large-2407 · 30 t/s · Tools, JSON · $2.00/M · 974ms ttft · 131K ctx
15. ministral-8b-2512 · 11 t/s · Tools, JSON, Vision · 14.8 intel · $0.150/M · 243ms ttft
16. devstral-2512 · 10 t/s · Tools, JSON · 22.0 intel · $0.400/M · 917ms ttft
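Output speed and time to first token (ttft) pull in different directions for short and long replies. The sketch below estimates end-to-end response time as ttft plus output tokens divided by output speed, using three entries from the ranking. This additive model is a simplifying assumption, not the site's methodology: it ignores queueing, network variance, and prompt length. The names `ModelSpeed` and `quickest` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpeed:
    name: str
    ttft_ms: float       # time to first token, as listed in the ranking
    tokens_per_s: float  # p50 output speed, as listed in the ranking

    def estimated_seconds(self, output_tokens: int) -> float:
        """Assumed model: wait for the first token, then stream at a constant rate."""
        return self.ttft_ms / 1000 + output_tokens / self.tokens_per_s


# Figures copied from the ranking; only three models are included for brevity.
MODELS = [
    ModelSpeed("mistral-small-2603", ttft_ms=385, tokens_per_s=110),
    ModelSpeed("mistral-nemo", ttft_ms=276, tokens_per_s=75),
    ModelSpeed("codestral-2508", ttft_ms=172, tokens_per_s=67),
]


def quickest(output_tokens: int, models: list[ModelSpeed] = MODELS) -> ModelSpeed:
    """Model with the lowest estimated response time for a given reply length."""
    return min(models, key=lambda m: m.estimated_seconds(output_tokens))


for n in (20, 500):
    best = quickest(n)
    print(f"{n} tokens: {best.name} (~{best.estimated_seconds(n):.2f}s)")
```

Under these assumptions, codestral-2508 comes out ahead of the other two for a 20-token reply thanks to its lower ttft, while mistral-small-2603 wins at 500 tokens, where output speed dominates.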

Frequently asked

What is the fastest Mistral model?

The fastest Mistral model is Mistral Small 4 at 110 output tokens per second. Mixtral 8x22B Instruct (102 t/s) and Mistral Small 3.2 24B (92 t/s) round out the top three.

What's a good alternative to Mistral Small 4?

Mixtral 8x22B Instruct (102 t/s) is the closest alternative on this metric, followed by Mistral Small 3.2 24B (92 t/s). See the full ranking above for the tradeoffs.

How many Mistral models are there?

modelgrep tracks 19 Mistral models with live benchmarks, speed, latency, and per-provider pricing; Mistral Medium 3.5 leads on intelligence. Of these, 16 qualify for this ranking.
