Google: Gemma 3 4B

google/gemma-3-4b-it

Cheaper than 95% of paidJSONVision

Use via OpenRouter ↗

Intelligence

—

Design Elo

—

Speed

—

tokens/sec

Latency

—

first token

Input price

$0.050

18th cheapest

Context

131K

16K max out

How it compares

Cheaper than95%

of all ranked models

Overview

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Providers & pricing (1)

Provider	In $/M	Out $/M	Context	Uptime
DeepInfrabf16	$0.050	$0.100	131K	100%

Specifications

Context window131K

Max output16K

Knowledge cutoffAug 2024

Input modalitiestext, image

Output modalitiestext

Prompt caching—

Cache read price—

ModeratedNo

Open weightsgoogle/gemma-3-4b-it ↗

Gemma 3 4B FAQ

How much does Gemma 3 4B cost?

Gemma 3 4B costs $0.050 per million input tokens and $0.100 per million output tokens via OpenRouter, making it 18th cheapest of 332 paid models.

What is Gemma 3 4B's context window?

Gemma 3 4B supports a 131K-token context window and can output up to 16K tokens. It accepts text, image input.

More from google

All Gemma 3 4B alternatives →

google/gemini-3.6-flash

Intel 50.1$1.50/M

google/gemini-3.6-flash:batch

Intel 50.1$0.750/M

google/gemini-3.5-flash-lite

Intel 36.5$0.300/M

google/gemini-3.5-flash-lite:batch

Intel 36.5$0.150/M

google/gemini-3.1-flash-lite-image

Intel —$0.250/M

google/gemini-3.1-flash-image

Intel —$0.500/M

Compare head-to-head

Google: Gemma 3 4B vs Google: Gemini 3.6 Flash Google: Gemma 3 4B vs Google: Gemini 3.6 Flash (batch)Google: Gemma 3 4B vs Google: Gemini 3.5 Flash Lite Google: Gemma 3 4B vs Google: Gemini 3.5 Flash Lite (batch)Google: Gemma 3 4B vs Google: Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)Google: Gemma 3 4B vs Google: Nano Banana 2 (Gemini 3.1 Flash Image)