modelgrep

Google: Gemini 2.5 Flash Lite

google/gemini-2.5-flash-lite

130th smartest of 181 · Cheaper than 81% of paid · Reasoning · Tools · JSON · Vision · Audio
Intelligence: 17.6 (130th of 181)
Design Elo: not listed
Speed (tokens/sec): not listed
Latency (first token): not listed
Input price: $0.100 per million tokens (57th cheapest)
Context: 1.0M (66K max out)

How it compares

Smarter than 28% of all ranked models
Cheaper than 81% of all ranked models

Overview

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Benchmarks

independent · via OpenRouter
Artificial Analysis: 26th percentile
Intelligence Index: 17.6
Coding Index: 9.5
Agentic Index: 6.1
GPQA Diamond: 63%
Humanity's Last Exam: 6%
SciCode: 19%
Tau²-Bench (agentic): 18%

Providers & pricing (3)

Provider          In $/M   Out $/M  Uptime
Google            $0.100   $0.400   99.7%
Google            $0.100   $0.400   99.5%
Google AI Studio  $0.100   $0.400   99.8%

Specifications

Context window: 1.0M
Max output: 66K
Knowledge cutoff: Jan 2025
Input modalities: text, image, file, audio, video
Output modalities: text
Prompt caching
Cache read price: $0.010/M
Moderated: No

Gemini 2.5 Flash Lite FAQ

How much does Gemini 2.5 Flash Lite cost?

Gemini 2.5 Flash Lite costs $0.100 per million input tokens and $0.400 per million output tokens via OpenRouter, making it the 57th cheapest of 298 paid models.
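
To make the per-million pricing concrete, here is a minimal sketch of a cost estimate in Python. The three prices are the ones listed on this page (input, output, and the cache read price from Specifications). The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes cached input tokens are billed at the cache read price in place of the input price, which this page does not spell out.

```python
from decimal import Decimal

# Prices in USD per million tokens, as listed on this page.
INPUT_PRICE_PER_M = Decimal("0.100")
OUTPUT_PRICE_PER_M = Decimal("0.400")
CACHE_READ_PRICE_PER_M = Decimal("0.010")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the portion of input_tokens served from the prompt
    cache. Assumption: those tokens are billed at the cache read price
    instead of the regular input price.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    print(estimate_cost(200_000, 10_000))
    print(estimate_cost(200_000, 10_000, cached_tokens=150_000))
```

Under these assumptions, a request with 200,000 input tokens and 10,000 output tokens comes to $0.024, and the same request with 150,000 of the input tokens read from cache comes to $0.0105. `Decimal` is used so that small per-token prices do not pick up floating-point rounding noise.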

How smart is Gemini 2.5 Flash Lite?

Gemini 2.5 Flash Lite scores 17.6 on the Artificial Analysis Intelligence Index, ranking 130th of 181 benchmarked models, with a GPQA Diamond score of 63%.

What is Gemini 2.5 Flash Lite's context window?

Gemini 2.5 Flash Lite supports a 1.0M-token context window and can output up to 66K tokens. It accepts text, image, file, audio, and video input.
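
As a rough illustration of how the two limits interact, the sketch below picks an output-token budget for a request. It rests on several assumptions that this page does not confirm: that "1.0M" and "66K" can be approximated as 1,000,000 and 66,000 tokens (the page shows only rounded figures), and that prompt and output tokens share the same context window. The helper name `plan_output_budget` is hypothetical.

```python
# Rounded limits as displayed on this page; exact values are not stated there.
CONTEXT_WINDOW_TOKENS = 1_000_000  # "1.0M", assumed to mean 1,000,000
MAX_OUTPUT_TOKENS = 66_000         # "66K", assumed to mean 66,000


def plan_output_budget(prompt_tokens: int, desired_output_tokens: int) -> int:
    """Return the largest output budget that fits (hypothetical helper).

    Assumption: prompt and output tokens count against the same context
    window, so the budget is capped by both the output limit and the
    room left after the prompt.
    """
    if prompt_tokens < 0 or desired_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW_TOKENS:
        raise ValueError("prompt does not fit in the context window")

    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    return min(desired_output_tokens, MAX_OUTPUT_TOKENS, remaining)


if __name__ == "__main__":
    print(plan_output_budget(900_000, 80_000))
    print(plan_output_budget(990_000, 50_000))
```

With these assumed limits, a 900,000-token prompt asking for 80,000 output tokens is capped at 66,000 by the output limit, while a 990,000-token prompt asking for 50,000 is capped at 10,000 by the remaining context.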
