Mistral Medium 3.5 is the best Mistral model for agents, scoring 53.2 on the Artificial Analysis Agentic Index for tool use and multi-step task completion. Mistral Medium 3.1 (25.3) and Devstral 2 2512 (21.9) round out the top three.
AI models ranked by the Artificial Analysis Agentic Index, which measures multi-step tool use, planning and task completion (including Tau²-Bench). The ranking highlights the best models for building autonomous agents and agentic workflows.
Mistral Medium 3.1 (25.3) is the closest alternative to Mistral Medium 3.5 on this metric, followed by Devstral 2 2512 (21.9). See the full ranking above for the tradeoffs.
modelgrep tracks 19 Mistral models with live benchmarks, speed, latency and per-provider pricing; Mistral Medium 3.5 leads them on intelligence. Nine of the 19 qualify for this ranking.