Fastest Google Models

Quick answer · Updated June 2026

The fastest Google model is Nano Banana (Gemini 2.5 Flash Image) at 249 output tokens per second. Gemini 2.5 Flash Lite Preview 09-2025 (160 t/s) and Gemini 3.5 Flash (151 t/s) round out the top three.

Speed: 249 t/s · Input: $0.300/M · Context: 33K

Google models ranked by output speed (tokens per second, p50), for low-latency and high-throughput applications.
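
As a rough illustration of the ranking metric, the sketch below computes a p50 (median) output speed from a handful of timed responses. The sample values are hypothetical, and it assumes speed is measured over the generation phase only, excluding time to first token; the page does not say how modelgrep actually measures it.

```python
from statistics import median


def output_speed(output_tokens: int, generation_seconds: float) -> float:
    """Tokens per second for one response (generation phase only, by assumption)."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return output_tokens / generation_seconds


def p50_speed(samples: list[tuple[int, float]]) -> float:
    """Median tokens per second across (output_tokens, generation_seconds) samples."""
    return median([output_speed(tokens, secs) for tokens, secs in samples])


# Hypothetical measurements, not taken from the ranking.
samples = [(500, 2.0), (480, 2.4), (510, 1.7)]
print(f"p50 output speed: {p50_speed(samples):.0f} t/s")
```

The median is used rather than the mean so that a single unusually slow or fast response does not shift the reported figure.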

1. gemini-2.5-flash-image · 249 t/s · JSON, Vision, Image out · $0.300/M · 4.5s ttft · 33K ctx
2. gemini-2.5-flash-lite-preview-09-2025 · 160 t/s · Reasoning, Tools, JSON +2 · 21.6 intel · $0.100/M · 510ms ttft
3. gemini-3.5-flash · 151 t/s · Reasoning, Tools, JSON +2 · 43.3 intel · $1.50/M · 2.0s ttft
4. gemini-3.1-flash-image-preview · 138 t/s · Reasoning, JSON, Vision +1 · $0.500/M · 10.5s ttft · 131K ctx
5. gemini-2.5-flash-lite · 108 t/s · Reasoning, Tools, JSON +2 · 17.6 intel · $0.100/M · 393ms ttft
6. gemini-3.1-flash-lite-preview · 105 t/s · Reasoning, Tools, JSON +2 · 33.5 intel · $0.250/M · 619ms ttft
7. gemini-3.1-flash-lite · 103 t/s · Reasoning, Tools, JSON +2 · $0.250/M · 639ms ttft · 1.0M ctx
8. gemini-2.5-pro · 97 t/s · Reasoning, Tools, JSON +2 · 34.6 intel · $1.25/M · 989ms ttft
9. gemini-2.5-pro-preview · 97 t/s · Reasoning, Tools, JSON +2 · $1.25/M · 989ms ttft · 1.0M ctx
10. gemini-2.5-pro-preview-05-06 · 97 t/s · Reasoning, Tools, JSON +2 · $1.25/M · 989ms ttft · 1.0M ctx
11. gemini-3.1-pro-preview · 89 t/s · Reasoning, Tools, JSON +2 · 41.3 intel · $2.00/M · 3.4s ttft
12. gemini-3-pro-image-preview · 76 t/s · Reasoning, JSON, Vision +1 · $2.00/M · 3.5s ttft · 66K ctx
13. gemini-2.5-flash · 74 t/s · Reasoning, Tools, JSON +2 · $0.300/M · 618ms ttft · 1.0M ctx
14. gemini-3-flash-preview · 71 t/s · Reasoning, Tools, JSON +2 · 46.4 intel · $0.500/M · 1.3s ttft
15. gemma-4-31b-it:free · 64 t/s · Reasoning, Tools, JSON +1 · 39.2 intel · Free · 269ms ttft
16. gemma-4-31b-it · 64 t/s · Reasoning, Tools, JSON +1 · 39.2 intel · $0.120/M · 269ms ttft
17. gemini-3.1-pro-preview-customtools · 62 t/s · Reasoning, Tools, JSON +2 · $2.00/M · 3.2s ttft · 1.0M ctx
18. gemma-2-27b-it · 48 t/s · JSON · $0.650/M · 824ms ttft · 8K ctx
19. gemma-3-27b-it · 45 t/s · Tools, JSON, Vision · 10.3 intel · $0.080/M · 498ms ttft
20. gemma-4-26b-a4b-it:free · 41 t/s · Reasoning, Tools, JSON +1 · 31.2 intel · Free · 519ms ttft
21. gemma-4-26b-a4b-it · 41 t/s · Reasoning, Tools, JSON +1 · 31.2 intel · $0.060/M · 519ms ttft
22. gemma-3n-e4b-it · 32 t/s · $0.060/M · 275ms ttft · 33K ctx
23. gemma-3-12b-it · 25 t/s · Tools, JSON, Vision · 8.8 intel · $0.050/M · 632ms ttft
24. gemma-3-4b-it · 21 t/s · JSON, Vision · 6.3 intel · $0.050/M · 492ms ttft
25. lyria-3-pro-preview · 5 t/s · JSON, Vision · Free · 6.5s ttft · 1.0M ctx
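
The ranking lists both output speed and time to first token (ttft), and the two can pull in different directions. As a back-of-the-envelope sketch, the snippet below estimates total response time as ttft plus output tokens divided by output speed. That formula is a simplifying assumption, not something the page states, and the three entries are copied from the ranking purely for illustration.

```python
# (ttft in seconds, output speed in tokens per second), copied from the ranking.
MODELS = {
    "gemini-2.5-flash-image": (4.5, 249),
    "gemini-2.5-flash-lite-preview-09-2025": (0.510, 160),
    "gemini-2.5-flash-lite": (0.393, 108),
}


def estimated_seconds(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Rough response time: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_s


def fastest_for(output_tokens: int) -> str:
    """Name of the model with the lowest estimated time for a given output length."""
    return min(
        MODELS,
        key=lambda name: estimated_seconds(*MODELS[name], output_tokens),
    )


for n in (200, 5000):
    print(f"{n} output tokens: {fastest_for(n)}")
```

Under this simple model, a high ttft dominates short responses, so the model with the highest throughput only comes out ahead once the output is long enough to amortize the initial wait.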

Frequently asked

What is the fastest Google model?

Nano Banana (Gemini 2.5 Flash Image), at 249 output tokens per second.

What's a good alternative to Nano Banana (Gemini 2.5 Flash Image)?

Gemini 2.5 Flash Lite Preview 09-2025 (160 t/s) is the closest alternative on this metric, followed by Gemini 3.5 Flash (151 t/s). See the full ranking above for the tradeoffs.

How many Google models are there?

modelgrep tracks 26 Google models with live benchmarks, speed, latency, and per-provider pricing. Gemini 3 Flash Preview leads on intelligence. Of the 26, 25 qualify for this ranking.
