modelgrep

Fastest Mistral Models

Quick answer · Updated June 2026

The fastest Mistral model is Mistral Small 4 at 110 output tokens per second. Mixtral 8x22B Instruct (102 t/s) and Mistral Small 3.2 24B (92 t/s) round out the top three.

Speed: 110 t/s
Intelligence: 18.6
Input: $0.150/M
Context: 262K

Mistral models ranked by output speed (tokens per second, p50), for low-latency and high-throughput applications.

  1. mistral-small-2603
    Reasoning, Tools, JSON, +1 · 18.6 intel · $0.150/M · 385ms ttft
    110 t/s
  2. mixtral-8x22b-instruct
    Tools, JSON · $2.00/M · 346ms ttft · 66K ctx
    102 t/s
  3. mistral-small-3.2-24b-instruct
    Tools, JSON, Vision · $0.075/M · 314ms ttft · 128K ctx
    92 t/s
  4. mistral-nemo
    Tools, JSON · $0.020/M · 276ms ttft · 131K ctx
    75 t/s
  5. codestral-2508
    Tools, JSON · $0.300/M · 172ms ttft · 256K ctx
    67 t/s
  6. voxtral-small-24b-2507
    Tools, JSON, Audio · $0.100/M · 268ms ttft · 32K ctx
    66 t/s
  7. mistral-saba
    Tools, JSON · $0.200/M · 295ms ttft · 33K ctx
    58 t/s
  8. ministral-3b-2512
    Tools, JSON, Vision · 11.2 intel · $0.100/M · 237ms ttft
    55 t/s
  9. ministral-14b-2512
    Tools, JSON, Vision · 16.0 intel · $0.200/M · 249ms ttft
    50 t/s
  10. mistral-medium-3
    Tools, JSON, Vision · 18.8 intel · $0.400/M · 397ms ttft
    38 t/s
  11. mistral-large-2512
    Tools, JSON, Vision · 22.8 intel · $0.500/M · 806ms ttft
    37 t/s
  12. mistral-large
    Tools, JSON · $2.00/M · 683ms ttft · 128K ctx
    33 t/s
  13. mistral-small-24b-instruct-2501
    JSON · $0.050/M · 479ms ttft · 33K ctx
    31 t/s
  14. mistral-large-2407
    Tools, JSON · $2.00/M · 974ms ttft · 131K ctx
    30 t/s
  15. ministral-8b-2512
    Tools, JSON, Vision · 14.8 intel · $0.150/M · 243ms ttft
    11 t/s
  16. devstral-2512
    Tools, JSON · 22.0 intel · $0.400/M · 917ms ttft
    10 t/s
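
The ranking orders models by output speed alone, but time to first token (ttft) also affects how long a response takes to arrive. As a rough sketch, assuming total time is simply ttft plus output tokens divided by output speed (and ignoring prompt length, queueing and provider variance), the figures from the list can be combined like this. The function names `estimated_seconds` and `fastest_for` are hypothetical helpers, not part of modelgrep, and only four of the listed models are included for brevity.

```python
# Figures copied from the ranking: (output tokens per second, time to first token in ms).
MODELS = {
    "mistral-small-2603": (110, 385),
    "mixtral-8x22b-instruct": (102, 346),
    "mistral-nemo": (75, 276),
    "codestral-2508": (67, 172),
}


def estimated_seconds(tokens_per_sec, ttft_ms, output_tokens):
    """Rough wall-clock estimate: wait for the first token, then stream the rest.

    Assumption: a simple linear model, not a measured end-to-end latency.
    """
    return ttft_ms / 1000 + output_tokens / tokens_per_sec


def fastest_for(output_tokens):
    """Return the model name with the lowest estimate for a given response length."""
    return min(
        MODELS,
        key=lambda name: estimated_seconds(*MODELS[name], output_tokens),
    )


if __name__ == "__main__":
    for n in (20, 500):
        name = fastest_for(n)
        secs = estimated_seconds(*MODELS[name], n)
        print(f"{n} output tokens: {name} (~{secs:.2f}s)")
```

Under this simplified model, a 20-token reply favors codestral-2508 because of its low ttft, while a 500-token reply favors mistral-small-2603 because of its higher output speed.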

Frequently asked

What is the fastest Mistral model?

The fastest Mistral model is Mistral Small 4 at 110 output tokens per second. Mixtral 8x22B Instruct (102 t/s) and Mistral Small 3.2 24B (92 t/s) round out the top three.

What's a good alternative to Mistral Small 4?

Mixtral 8x22B Instruct (102 t/s) is the closest alternative on this metric, followed by Mistral Small 3.2 24B (92 t/s). See the full ranking above for the tradeoffs.

How many Mistral models are there?

modelgrep tracks 19 Mistral models with live benchmarks, speed, latency and per-provider pricing. Mistral Medium 3.5 leads them on intelligence. Of the 19, 16 qualify for this ranking.
