modelgrep

Google: Gemma 3 12B

google/gemma-3-12b-it

170th smartest of 178 · Cheaper than 93% of paid · Tools · JSON · Vision
Intelligence: 8.8 (170th of 178)
Design Elo: (no value shown)
Speed: 37 tokens per second (211th fastest)
Latency: 501ms to first token
Input price: $0.050 per million tokens (20th cheapest)
Context: 131K (16K max output)

How it compares

Smarter than 4% of all ranked models
Faster than 28% of all ranked models
Cheaper than 93% of all ranked models

Overview

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Benchmarks

independent · via OpenRouter
Artificial Analysis: 6th percentile

Intelligence Index: 8.8
Coding Index: 6.3
Agentic Index: 3.6
GPQA Diamond: 35%
Humanity's Last Exam: 5%
SciCode: 17%
Tau²-Bench (agentic): 11%

Providers & pricing (1)

Provider            In $/M    Out $/M   Uptime
DeepInfra (bf16)    $0.050    $0.150    100%

Specifications

Context window: 131K
Max output: 16K
Knowledge cutoff: Aug 2024
Input modalities: text, image
Output modalities: text
Prompt caching: (no value shown)
Cache read price: (no value shown)
Moderated: No

Gemma 3 12B FAQ

How much does Gemma 3 12B cost?

Gemma 3 12B costs $0.050 per million input tokens and $0.150 per million output tokens via OpenRouter, making it the 20th cheapest of 298 paid models.
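To make the per-million pricing concrete, here is a minimal sketch that turns the listed rates into a per-request estimate. The function name and the example token counts are hypothetical; only the two prices come from the listing, and real billing may differ (for example, in how image input is counted).

```python
# Prices copied from the listing (USD per million tokens).
INPUT_PRICE_PER_M = 0.050
OUTPUT_PRICE_PER_M = 0.150


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens, 1,000 completion tokens.
    print(f"${estimate_cost_usd(10_000, 1_000):.6f}")
```

Because output tokens cost three times as much as input tokens here, long completions dominate the bill faster than long prompts do.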

How smart is Gemma 3 12B?

Gemma 3 12B scores 8.8 on the Artificial Analysis Intelligence Index, ranking 170th of 178 benchmarked models, with a GPQA Diamond score of 35%.

How fast is Gemma 3 12B?

Gemma 3 12B generates around 37 tokens per second with 501ms time-to-first-token (p50), the 211th fastest tracked model.
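As a rough illustration of what these two figures imply together, the sketch below estimates end-to-end response time as time-to-first-token plus generation time. It assumes a constant generation rate and treats the p50 latency as typical, which real traffic will not match exactly; the function name is hypothetical.

```python
# Figures copied from the listing.
TTFT_SECONDS = 0.501        # p50 time-to-first-token
TOKENS_PER_SECOND = 37.0    # approximate generation speed


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total wall-clock seconds to receive `output_tokens` tokens."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical completion lengths.
    for n in (100, 370, 1_000):
        print(n, round(estimate_response_seconds(n), 2))
```

For anything beyond a few dozen tokens, generation speed rather than first-token latency accounts for most of the wait.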

What is Gemma 3 12B's context window?

Gemma 3 12B supports a 131K-token context window and can output up to 16K tokens. It accepts text and image input.
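A small sketch of how these limits might be checked before sending a request follows. It rests on two assumptions not stated in the listing: that "131K" and "16K" mean 131,072 and 16,384 tokens, and that prompt and output tokens share the same context window. The function name is hypothetical.

```python
# Assumption: "131K" and "16K" are read as powers of two.
CONTEXT_WINDOW = 131_072
MAX_OUTPUT = 16_384


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if the request stays within both listed limits.

    Assumes prompt and output tokens count against the same window.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    print(fits_limits(100_000, 16_000))
    print(fits_limits(120_000, 16_000))
```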
