Fastest LLMs

Updated July 2026

AI models ranked by median (p50) output speed, measured in tokens per second. Use this ranking to find the fastest large language models for low-latency and high-throughput applications.

Benchmark data for this ranking is temporarily unavailable — check back shortly.

By maker

OpenAI, Qwen, Google, Anthropic, Mistral, DeepSeek, Z.ai, NVIDIA

All rankings

Small & Fast LLMs, Best Local LLMs, Smartest LLMs, Best LLMs for Coding, Best LLMs for Design & Frontend, Lowest-Latency LLMs, Cheapest LLMs, Best Free LLMs, Best Reasoning LLMs, Best Vision LLMs, Best LLMs for Agents, Best Open-Source LLMs, Longest-Context LLMs, Best LLMs for Writing, Best LLMs for Math & Science, Best LLMs for RAG, Best LLMs for SQL & Data Analysis, Best LLMs for Roleplay, Best Uncensored LLMs