Granite 4.1 8B is the top-ranked IBM model for agents, scoring 10.7 on the Artificial Analysis Agentic Index, which measures tool use and multi-step task completion. Granite 4.0 Micro is next at 4.2.
AI models ranked by the Artificial Analysis Agentic Index — measuring multi-step tool use, planning and task completion (including Tau²-Bench). The best models for building autonomous agents and agentic workflows.
Granite 4.0 Micro (4.2) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
modelgrep tracks 2 IBM models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Granite 4.1 8B. Both qualify for this ranking.