
IBM: Granite 4.0 Micro

ibm-granite/granite-4.0-h-micro

175th smartest of 178
Cheaper than 99% of paid
Intelligence: 7.7 (175th of 178)
Design Elo
Speed: 29 tokens per second (239th fastest)
Latency: 433 ms to first token
Input price: $0.017 per million tokens (2nd cheapest)
Context: 131K (131K max output)

How it compares

Smarter than 2% of all ranked models
Faster than 19% of all ranked models
Cheaper than 99% of all ranked models

Overview

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family. These models are the latest in a series released by IBM. They are fine-tuned for long...

Benchmarks

independent · via OpenRouter
Artificial Analysis (4th percentile)
Intelligence Index: 7.7
Coding Index: 5.0
Agentic Index: 4.2
GPQA Diamond: 34%
Humanity's Last Exam: 5%
SciCode: 12%
Tau²-Bench (agentic): 13%

Providers & pricing (1)

Provider      In $/M    Out $/M    Uptime
Cloudflare    $0.017    $0.112     100%

Specifications

Context window: 131K
Max output: 131K
Knowledge cutoff
Input modalities: text
Output modalities: text
Prompt caching
Cache read price
Moderated: No

Granite 4.0 Micro FAQ

How much does Granite 4.0 Micro cost?

Granite 4.0 Micro costs $0.017 per million input tokens and $0.112 per million output tokens via OpenRouter, making it 2nd cheapest of 298 paid models.
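
To make the per-million pricing concrete, here is a minimal sketch of a per-request cost estimate. It uses only the two listed prices. The function name `estimate_cost` and the example token counts are illustrative assumptions, and actual billing may differ (for example, through provider-specific rounding or fees).

```python
# Listed prices in USD per million tokens (from the pricing table on this page).
INPUT_PRICE_PER_M = 0.017
OUTPUT_PRICE_PER_M = 0.112


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 10,000-token prompt with a 2,000-token reply (assumed sizes).
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.6f}")  # prints $0.000394
```

At these rates, output tokens cost roughly 6.6 times as much as input tokens, so long replies dominate the bill even for this low-priced model.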

How smart is Granite 4.0 Micro?

Granite 4.0 Micro scores 7.7 on the Artificial Analysis Intelligence Index, ranking 175th of 178 benchmarked models, with a GPQA Diamond score of 34%.
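
The "smarter than 2%" and "cheaper than 99%" figures on this page are consistent with a simple share-of-models-outranked calculation. The sketch below shows that arithmetic. It is an assumption about how such percentages could be derived, not the site's documented method, and `share_outranked` is a hypothetical helper.

```python
def share_outranked(rank: int, total: int) -> int:
    """Percentage of models ranked below `rank` out of `total`, rounded.

    Assumes rank 1 is best and that the model itself is not counted
    among the models it outranks.
    """
    return round(100 * (total - rank) / total)


# Intelligence: 175th of 178 benchmarked models.
print(share_outranked(175, 178))  # prints 2

# Price: 2nd cheapest of 298 paid models.
print(share_outranked(2, 298))  # prints 99
```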

How fast is Granite 4.0 Micro?

Granite 4.0 Micro generates around 29 tokens per second with a 433 ms time-to-first-token (p50), making it the 239th fastest tracked model.
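
As a rough illustration of what these numbers mean for a single response, total time can be approximated as time-to-first-token plus output length divided by throughput. This sketch assumes throughput stays constant at the listed median. The function name `estimate_response_seconds` and the 500-token example are assumptions, and real latency varies with load and prompt length.

```python
# Listed median figures from this page.
TIME_TO_FIRST_TOKEN_S = 0.433  # 433 ms
TOKENS_PER_SECOND = 29.0


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock seconds for a reply of `output_tokens` tokens."""
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


# Example: a 500-token reply (assumed size).
print(f"{estimate_response_seconds(500):.1f} s")  # prints 17.7 s
```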

What is Granite 4.0 Micro's context window?

Granite 4.0 Micro supports a 131K-token context window and can output up to 131K tokens. It accepts text input.
