Phi 4 Mini Instruct is the highest-scoring Microsoft model for agents, with 2.7 on the Artificial Analysis Agentic Index for tool use and multi-step task completion. Phi 4 is next, at 0.0.
AI models ranked by the Artificial Analysis Agentic Index, which measures multi-step tool use, planning, and task completion (including Tau²-Bench). The top-ranked models are the strongest candidates for building autonomous agents and agentic workflows.
Phi 4 (0.0) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
modelgrep tracks 3 Microsoft models with live benchmarks, speed, latency, and per-provider pricing; Phi 4 leads them on intelligence. Two of the three qualify for this ranking.