modelgrep

Fastest Google Models

Quick answer · Updated June 2026

The fastest Google model is Nano Banana (Gemini 2.5 Flash Image) at 249 output tokens per second. Gemini 2.5 Flash Lite Preview 09-2025 (160 t/s) and Gemini 3.5 Flash (151 t/s) round out the top three.

249 t/sSpeed
$0.300Input /M
33KContext

AI models ranked by output speed (tokens per second, p50). The fastest large language models for low-latency and high-throughput applications.

  1. 1G
    gemini-2.5-flash-image
    JSONVisionImage out$0.300/M · 4.5s ttft · 33K ctx
    249 t/s
    Speed
  2. 2G
    gemini-2.5-flash-lite-preview-09-2025
    ReasoningToolsJSON+221.6 intel · $0.100/M · 510ms ttft
    160 t/s
    Speed
  3. 3G
    gemini-3.5-flash
    ReasoningToolsJSON+243.3 intel · $1.50/M · 2.0s ttft
    151 t/s
    Speed
  4. 4G
    gemini-3.1-flash-image-preview
    ReasoningJSONVision+1$0.500/M · 10.5s ttft · 131K ctx
    138 t/s
    Speed
  5. 5G
    gemini-2.5-flash-lite
    ReasoningToolsJSON+217.6 intel · $0.100/M · 393ms ttft
    108 t/s
    Speed
  6. 6G
    gemini-3.1-flash-lite-preview
    ReasoningToolsJSON+233.5 intel · $0.250/M · 619ms ttft
    105 t/s
    Speed
  7. 7G
    gemini-3.1-flash-lite
    ReasoningToolsJSON+2$0.250/M · 639ms ttft · 1.0M ctx
    103 t/s
    Speed
  8. 8G
    gemini-2.5-pro
    ReasoningToolsJSON+234.6 intel · $1.25/M · 989ms ttft
    97 t/s
    Speed
  9. 9G
    gemini-2.5-pro-preview
    ReasoningToolsJSON+2$1.25/M · 989ms ttft · 1.0M ctx
    97 t/s
    Speed
  10. 10G
    gemini-2.5-pro-preview-05-06
    ReasoningToolsJSON+2$1.25/M · 989ms ttft · 1.0M ctx
    97 t/s
    Speed
  11. 11G
    gemini-3.1-pro-preview
    ReasoningToolsJSON+241.3 intel · $2.00/M · 3.4s ttft
    89 t/s
    Speed
  12. 12G
    gemini-3-pro-image-preview
    ReasoningJSONVision+1$2.00/M · 3.5s ttft · 66K ctx
    76 t/s
    Speed
  13. 13G
    gemini-2.5-flash
    ReasoningToolsJSON+2$0.300/M · 618ms ttft · 1.0M ctx
    74 t/s
    Speed
  14. 14G
    gemini-3-flash-preview
    ReasoningToolsJSON+246.4 intel · $0.500/M · 1.3s ttft
    71 t/s
    Speed
  15. 15G
    gemma-4-31b-it:free
    ReasoningToolsJSON+139.2 intel · Free/M · 269ms ttft
    64 t/s
    Speed
  16. 16G
    gemma-4-31b-it
    ReasoningToolsJSON+139.2 intel · $0.120/M · 269ms ttft
    64 t/s
    Speed
  17. 17G
    gemini-3.1-pro-preview-customtools
    ReasoningToolsJSON+2$2.00/M · 3.2s ttft · 1.0M ctx
    62 t/s
    Speed
  18. 18G
    gemma-2-27b-it
    JSON$0.650/M · 824ms ttft · 8K ctx
    48 t/s
    Speed
  19. 19G
    gemma-3-27b-it
    ToolsJSONVision10.3 intel · $0.080/M · 498ms ttft
    45 t/s
    Speed
  20. 20G
    gemma-4-26b-a4b-it:free
    ReasoningToolsJSON+131.2 intel · Free/M · 519ms ttft
    41 t/s
    Speed
  21. 21G
    gemma-4-26b-a4b-it
    ReasoningToolsJSON+131.2 intel · $0.060/M · 519ms ttft
    41 t/s
    Speed
  22. 22G
    gemma-3n-e4b-it
    $0.060/M · 275ms ttft · 33K ctx
    32 t/s
    Speed
  23. 23G
    gemma-3-12b-it
    ToolsJSONVision8.8 intel · $0.050/M · 632ms ttft
    25 t/s
    Speed
  24. 24G
    gemma-3-4b-it
    JSONVision6.3 intel · $0.050/M · 492ms ttft
    21 t/s
    Speed
  25. 25G
    lyria-3-pro-preview
    JSONVisionFree/M · 6.5s ttft · 1.0M ctx
    5 t/s
    Speed

Frequently asked

What is the fastest Google model?

The fastest Google model is Nano Banana (Gemini 2.5 Flash Image) at 249 output tokens per second. Gemini 2.5 Flash Lite Preview 09-2025 (160 t/s) and Gemini 3.5 Flash (151 t/s) round out the top three.

What's a good alternative to Nano Banana (Gemini 2.5 Flash Image)?

Gemini 2.5 Flash Lite Preview 09-2025 (160 t/s) is the closest alternative on this metric, followed by Gemini 3.5 Flash (151 t/s). See the full ranking above for the tradeoffs.

How many Google models are there?

modelgrep tracks 26 Google models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Gemini 3 Flash Preview. 25 of them qualify for this ranking.

More Google rankings

All rankings