modelgrep

Small & Fast Mistral Models

Quick answer · Updated June 2026

The fastest small Mistral model is Mistral Small 4 (mistral-small-2603), at 110 tokens/sec and $0.150 per million input tokens. Like the rest of the efficient tier, it trades a few points of raw intelligence for speed and cost, which suits high-volume, latency-sensitive work. Mistral Small 3.2 24B (92 t/s) and Mistral Nemo (75 t/s) round out the top three.

Speed: 110 t/s
Intelligence: 18.6
Input price: $0.150 per million tokens
Context: 262K

Compact, efficient models (the small/mini/flash/haiku tier), ranked by output speed. These trade a little raw intelligence for low cost and high throughput, which is the right tradeoff for chat, classification, extraction and other high-volume work.

  1. mistral-small-2603
     Speed: 110 t/s
     Tags: Reasoning, Tools, JSON, +1 more
     18.6 intel · $0.150/M · 385 ms TTFT
  2. mistral-small-3.2-24b-instruct
     Speed: 92 t/s
     Tags: Tools, JSON, Vision
     $0.075/M · 314 ms TTFT · 128K ctx
  3. mistral-nemo
     Speed: 75 t/s
     Tags: Tools, JSON
     $0.020/M · 276 ms TTFT · 131K ctx
  4. codestral-2508
     Speed: 67 t/s
     Tags: Tools, JSON
     $0.300/M · 172 ms TTFT · 256K ctx
  5. voxtral-small-24b-2507
     Speed: 66 t/s
     Tags: Tools, JSON, Audio
     $0.100/M · 268 ms TTFT · 32K ctx
  6. mistral-saba
     Speed: 58 t/s
     Tags: Tools, JSON
     $0.200/M · 295 ms TTFT · 33K ctx
  7. ministral-3b-2512
     Speed: 55 t/s
     Tags: Tools, JSON, Vision
     11.2 intel · $0.100/M · 237 ms TTFT
  8. ministral-14b-2512
     Speed: 50 t/s
     Tags: Tools, JSON, Vision
     16.0 intel · $0.200/M · 249 ms TTFT
  9. mistral-small-24b-instruct-2501
     Speed: 31 t/s
     Tags: JSON
     $0.050/M · 479 ms TTFT · 33K ctx
  10. ministral-8b-2512
     Speed: 11 t/s
     Tags: Tools, JSON, Vision
     14.8 intel · $0.150/M · 243 ms TTFT
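
To make the speed and price tradeoff in the ranking above concrete, here is a minimal Python sketch that estimates input cost and response time for the top three models. It rests on several assumptions: the per-million price in each row is treated as the input price (as the summary for the top model suggests), output pricing is ignored because it is not listed, time to first token is paid once, and tokens then stream at a constant rate. The function names and the example request sizes are hypothetical, chosen only for illustration.

```python
# Figures copied from the top three rows of the ranking above.
# Each tuple: (model id, output tokens/sec, USD per million input tokens, TTFT in ms)
MODELS = [
    ("mistral-small-2603", 110, 0.150, 385),
    ("mistral-small-3.2-24b-instruct", 92, 0.075, 314),
    ("mistral-nemo", 75, 0.020, 276),
]


def input_cost_usd(input_tokens, price_per_million):
    """Input-side cost only; output pricing is not listed in the ranking."""
    return input_tokens / 1_000_000 * price_per_million


def response_seconds(output_tokens, tokens_per_sec, ttft_ms):
    """Assumes TTFT is paid once, then tokens stream at a constant rate."""
    return ttft_ms / 1000 + output_tokens / tokens_per_sec


def estimate(input_tokens, output_tokens):
    """Return one estimate per model, fastest estimated response first."""
    rows = [
        {
            "model": name,
            "input_cost_usd": input_cost_usd(input_tokens, price),
            "seconds": response_seconds(output_tokens, tps, ttft),
        }
        for name, tps, price, ttft in MODELS
    ]
    return sorted(rows, key=lambda row: row["seconds"])


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens, 500 output tokens.
    for row in estimate(2_000, 500):
        print(f'{row["model"]}: ${row["input_cost_usd"]:.6f} input, ~{row["seconds"]:.2f} s')
```

Under these assumptions, throughput decides long responses, while time to first token decides short ones: a 500-token reply finishes first on mistral-small-2603, but a 10-token reply finishes first on mistral-nemo despite its lower t/s figure.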

Frequently asked

What is the smallest, fastest Mistral model?

This ranking orders models by output speed rather than parameter count. On that measure, Mistral Small 4 leads at 110 tokens/sec and $0.150 per million input tokens. Mistral Small 3.2 24B (92 t/s) and Mistral Nemo (75 t/s) round out the top three.

What's a good alternative to Mistral Small 4?

Mistral Small 3.2 24B (92 t/s) is the closest alternative on output speed, followed by Mistral Nemo (75 t/s). See the full ranking above for the tradeoffs.

How many Mistral models are there?

modelgrep tracks 19 Mistral models with live benchmarks, speed, latency and per-provider pricing. Mistral Medium 3.5 leads them on intelligence. Ten of the 19 qualify for this ranking.
