modelgrep

Google: Gemini 3.1 Flash Lite

google/gemini-3.1-flash-lite

Reasoning, Tools, JSON, Vision, Audio
Intelligence
Design Elo
Speed: 104 tokens per second (55th fastest)
Latency: 623 ms to first token
Input price: $0.250 per million tokens (107th cheapest)
Context: 1.0M tokens (66K max output)

How it compares

Faster than 82% of all ranked models
Cheaper than 64% of all ranked models

Overview

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

Providers & pricing (2)

Provider                   In $/M   Out $/M  Uptime
Google (cache)             $0.250   $1.50    96.8%
Google AI Studio (cache)   $0.250   $1.50    99.5%

Specifications

Context window: 1.0M
Max output: 66K
Knowledge cutoff: (not specified)
Input modalities: text, image, video, file, audio
Output modalities: text
Prompt caching: Supported
Cache read price: $0.025/M
Moderated: No

Gemini 3.1 Flash Lite FAQ

How much does Gemini 3.1 Flash Lite cost?

Gemini 3.1 Flash Lite costs $0.250 per million input tokens and $1.50 per million output tokens via OpenRouter, making it the 107th cheapest of 298 paid models.
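
To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the listed prices, plus the cache read price listed under Specifications. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes cached input tokens are billed at the cache read price instead of the regular input price; actual billing rules may differ by provider.

```python
# Prices in USD per million tokens, copied from the listing.
INPUT_PRICE_PER_M = 0.250
OUTPUT_PRICE_PER_M = 1.50
CACHE_READ_PRICE_PER_M = 0.025


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Return the estimated USD cost of one request.

    Assumption: cached input tokens are charged at the cache read price
    and the remaining input tokens at the regular input price.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    fresh_input_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        fresh_input_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total_per_million / 1_000_000
```

Under these assumptions, a request with one million uncached input tokens and one million output tokens comes to $0.250 + $1.50 = $1.75.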

How fast is Gemini 3.1 Flash Lite?

Gemini 3.1 Flash Lite generates around 104 tokens per second with a 623 ms time-to-first-token (p50), making it the 55th fastest tracked model.
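
As a rough illustration of what these figures imply for end-to-end response time, the Python sketch below adds the time-to-first-token to the time needed to stream the output at the listed throughput. The function name `estimate_response_seconds` is hypothetical, and the model is simplified: it assumes steady throughput at the listed rate and ignores network overhead and load-dependent variation.

```python
# Figures copied from the listing; treated here as typical values.
TOKENS_PER_SECOND = 104
TIME_TO_FIRST_TOKEN_S = 0.623


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total seconds to receive `output_tokens` tokens.

    Assumption: total time = time-to-first-token + tokens / throughput,
    with throughput constant for the whole response.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND
```

Under this simplified model, a 1,040-token response would take about 0.623 + 10 = 10.6 seconds.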

What is Gemini 3.1 Flash Lite's context window?

Gemini 3.1 Flash Lite supports a 1.0M-token context window and can output up to 66K tokens. It accepts text, image, video, file, and audio input.
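
The Python sketch below shows one way to check how much output a request can ask for given its input size. The function name `output_budget` is hypothetical. The constants are the rounded figures from the listing ("1.0M" and "66K"), not exact limits, and the sketch assumes input and output tokens share the same context window, which the listing does not state.

```python
# Rounded figures from the listing; exact limits may differ.
CONTEXT_WINDOW_TOKENS = 1_000_000  # shown as "1.0M"
MAX_OUTPUT_TOKENS = 66_000         # shown as "66K"


def output_budget(input_tokens: int) -> int:
    """Return the largest output token count that still fits.

    Assumption: input and output tokens count against one shared
    context window, and output is also capped at MAX_OUTPUT_TOKENS.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - input_tokens
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))
```

With these assumed limits, a 950,000-token prompt leaves room for at most 50,000 output tokens, while a 10,000-token prompt is limited only by the 66K output cap.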
