modelgrep

Meta: Llama Guard 4 12B

meta-llama/llama-guard-4-12b

JSON, Vision
Intelligence: not listed
Design Elo: not listed
Speed: 18 tokens per second (268th fastest)
Latency: 120ms to first token
Input price: $0.180 per million tokens (90th cheapest)
Context: 164K tokens (16K max output)

How it compares

Faster than 9% of all ranked models
Cheaper than 70% of all ranked models

Overview

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

Providers & pricing (2)

Provider           In $/M    Out $/M   Uptime
DeepInfra (bf16)   $0.180    $0.180    100%
Together           $0.200    $0.200    100%

Specifications

Context window: 164K
Max output: 16K
Knowledge cutoff: Aug 2024
Input modalities: image, text
Output modalities: text
Prompt caching: not listed
Cache read price: not listed
Moderated: No

Llama Guard 4 12B FAQ

How much does Llama Guard 4 12B cost?

Llama Guard 4 12B costs $0.180 per million input tokens and $0.180 per million output tokens via OpenRouter, making it the 90th cheapest of 298 paid models.
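
As a rough illustration of what that pricing means in practice, the sketch below estimates the cost of a batch of moderation checks. The per-token prices are the ones listed above; the function name and the workload (1,000,000 checks of 500 input and 10 output tokens each) are hypothetical and chosen only for the example.

```python
# Prices from the listing above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.180
OUTPUT_PRICE_PER_M = 0.180


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical workload: 1,000,000 checks, each 500 input and 10 output tokens.
per_request = estimate_cost(500, 10)
total = per_request * 1_000_000
print(f"${total:.2f}")  # prints $91.80
```

Because input and output tokens are priced identically here, only the total token count per request affects the estimate.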

How fast is Llama Guard 4 12B?

Llama Guard 4 12B generates around 18 tokens per second with a 120ms time-to-first-token (p50), making it the 268th fastest tracked model.
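
To put those two figures together, a simple estimate of end-to-end response time is the time-to-first-token plus the output length divided by throughput. The sketch below uses the listed p50 numbers; the function name and the 10-token output length are hypothetical, and the model ignores network overhead and queueing.

```python
# Figures from the listing above (p50 values).
TOKENS_PER_SECOND = 18.0
TIME_TO_FIRST_TOKEN_S = 0.120


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time for a response of the given length."""
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


# Hypothetical short verdict of about 10 output tokens.
print(round(estimate_response_seconds(10), 2))  # prints 0.68
```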

What is Llama Guard 4 12B's context window?

Llama Guard 4 12B supports a 164K-token context window and can output up to 16K tokens. It accepts image and text input.
