modelgrep

Fastest Meta Models

Quick answer · Updated June 2026

The fastest Meta model is Llama 3.2 1B Instruct at 169 output tokens per second. Llama 3.1 8B Instruct (145 t/s) and Llama 4 Scout (130 t/s) round out the top three.

Speed: 169 t/s
Intelligence: 6.3
Input: $0.027/M
Context: 131K

Meta models ranked by output speed (median, or p50, tokens per second), for low-latency and high-throughput applications.
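As a rough illustration of what a p50 figure means, the sketch below takes the median of several per-request output speeds. The sample values are hypothetical, and how modelgrep actually collects and aggregates its measurements is not stated on this page.

```python
from statistics import median


def p50_tokens_per_second(samples):
    """Return the median (p50) of per-request output speeds in tokens/second."""
    if not samples:
        raise ValueError("need at least one sample")
    return median(samples)


# Hypothetical per-request measurements: output tokens / generation seconds.
samples = [160.0, 171.0, 169.0, 90.0, 175.0]

p50 = p50_tokens_per_second(samples)
mean = sum(samples) / len(samples)

print(f"p50:  {p50:.1f} t/s")
print(f"mean: {mean:.1f} t/s")
```

In this made-up sample a single slow request (90 t/s) drags the mean down to 153 t/s while the median stays at 169 t/s, which is one reason a p50 is a common choice for reporting typical speed.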

  1. llama-3.2-1b-instruct
    6.3 intel · $0.027/M · 332ms ttft
    169 t/s
  2. llama-3.1-8b-instruct
    Tools · JSON · 11.8 intel · $0.020/M · 143ms ttft
    145 t/s
  3. llama-4-scout
    Tools · JSON · Vision · 13.5 intel · $0.100/M · 249ms ttft
    130 t/s
  4. llama-3.3-70b-instruct:free
    Tools · 14.5 intel · Free · 244ms ttft
    115 t/s
  5. llama-3.3-70b-instruct
    Tools · JSON · 14.5 intel · $0.100/M · 244ms ttft
    115 t/s
  6. llama-3.2-3b-instruct:free
    Free · 223ms ttft · 131K ctx
    102 t/s
  7. llama-3.2-3b-instruct
    $0.051/M · 223ms ttft · 131K ctx
    102 t/s
  8. llama-4-maverick
    Tools · JSON · Vision · 18.4 intel · $0.150/M · 303ms ttft
    72 t/s
  9. llama-3-8b-instruct
    6.4 intel · $0.140/M · 660ms ttft
    63 t/s
  10. llama-3.2-11b-vision-instruct
    JSON · Vision · 8.7 intel · $0.345/M · 164ms ttft
    35 t/s
  11. llama-3.1-70b-instruct
    Tools · JSON · 12.5 intel · $0.400/M · 303ms ttft
    28 t/s
  12. llama-guard-4-12b
    JSON · Vision · $0.180/M · 120ms ttft · 164K ctx
    18 t/s
  13. llama-3-70b-instruct
    JSON · 8.9 intel · $0.510/M · 1.3s ttft
    18 t/s
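
Output speed is only part of perceived latency, because each request also waits for its first token. The sketch below combines the ttft and t/s figures listed for the top three models into a rough end-to-end estimate. It assumes total time is simply ttft plus output tokens divided by output speed, which is a simplification and not a formula published by modelgrep; the function names are illustrative.

```python
def estimated_response_seconds(ttft_ms, tokens_per_second, output_tokens):
    """Rough end-to-end time: wait for the first token, then stream the rest.

    Assumption: total = ttft + output_tokens / output speed.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


# (ttft in ms, output speed in t/s), copied from the ranking above.
models = {
    "llama-3.2-1b-instruct": (332, 169),
    "llama-3.1-8b-instruct": (143, 145),
    "llama-4-scout": (249, 130),
}


def fastest_for(output_tokens):
    """Name of the model with the lowest estimated time for a given reply length."""
    return min(
        models,
        key=lambda name: estimated_response_seconds(*models[name], output_tokens),
    )


for n in (50, 1000):
    print(n, "tokens ->", fastest_for(n))
```

Under this simplified model, llama-3.1-8b-instruct comes out ahead for a 50-token reply thanks to its lower ttft, while llama-3.2-1b-instruct wins for a 1000-token reply, so the best pick can depend on typical response length.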

Frequently asked

What is the fastest Meta model?

The fastest Meta model is Llama 3.2 1B Instruct at 169 output tokens per second. Llama 3.1 8B Instruct (145 t/s) and Llama 4 Scout (130 t/s) round out the top three.

What's a good alternative to Llama 3.2 1B Instruct?

Llama 3.1 8B Instruct (145 t/s) is the closest alternative on this metric, followed by Llama 4 Scout (130 t/s). See the full ranking above for the tradeoffs.

How many Meta models are there?

modelgrep tracks 13 Meta models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Llama 4 Maverick (18.4). All 13 qualify for this ranking.
