modelgrep

Best LLMs for Agents

Quick answer · Updated June 2026

Claude Fable 5 is the best LLM for agents, scoring 80.6 on the Artificial Analysis Agentic Index for tool use and multi-step task completion. Claude Opus 4.8 (77.8) and Claude Opus 4.7 (71.3) round out the top three.

80.6 Agentic
64.9 Intelligence
$10.00/M Input
1M Context

AI models ranked by the Artificial Analysis Agentic Index, which measures multi-step tool use, planning and task completion (including Tau²-Bench). These are the top-scoring models for building autonomous agents and agentic workflows.

  1. claude-fable-5 · 80.6 Agentic · 64.9 intel · $10.00/M · 1M ctx · Reasoning, Tools, JSON, +1
  2. claude-opus-4.8 · 77.8 Agentic · 61.4 intel · $5.00/M · 58 t/s · Reasoning, Tools, JSON, +1
  3. claude-opus-4.7 · 71.3 Agentic · 57.3 intel · $5.00/M · 60 t/s · Reasoning, Tools, JSON, +1
  4. gpt-5.5 · 69.4 Agentic · 56.7 intel · $5.00/M · 37 t/s · Reasoning, Tools, JSON, +1
  5. minimax-m3 · 68.6 Agentic · 54.7 intel · $0.300/M · 47 t/s · Reasoning, Tools, JSON, +1
  6. claude-opus-4.6 · 67.6 Agentic · 52.9 intel · $5.00/M · 46 t/s · Reasoning, Tools, JSON, +1
  7. mimo-v2.5-pro · 67.4 Agentic · 53.8 intel · $0.435/M · 32 t/s · Reasoning, Tools, JSON
  8. qwen3.7-max · 66.6 Agentic · 56.6 intel · $1.25/M · 39 t/s · Reasoning, Tools, JSON
  9. glm-5-turbo · 66.1 Agentic · 46.8 intel · $1.20/M · 15 t/s · Reasoning, Tools, JSON
  10. glm-5.1 · 66.0 Agentic · 43.8 intel · $0.980/M · 93 t/s · Reasoning, Tools, JSON
  11. grok-4.3 · 65.9 Agentic · 53.2 intel · $1.25/M · 135 t/s · Reasoning, Tools, JSON, +1
  12. mimo-v2.5 · 65.5 Agentic · 49.0 intel · $0.140/M · 49 t/s · Reasoning, Tools, JSON, +2
  13. qwen3.7-plus · 65.1 Agentic · 53.3 intel · $0.320/M · 29 t/s · Reasoning, Tools, JSON, +1
  14. qwen3.6-max-preview · 64.8 Agentic · 51.8 intel · $1.04/M · 48 t/s · Reasoning, Tools, JSON
  15. deepseek-v4-pro · 63.3 Agentic · 39.3 intel · $0.435/M · 62 t/s · Reasoning, Tools, JSON
  16. glm-5 · 63.1 Agentic · 49.8 intel · $0.600/M · 112 t/s · Reasoning, Tools, JSON
  17. deepseek-v4-flash · 62.3 Agentic · 46.0 intel · $0.098/M · 79 t/s · Reasoning, Tools, JSON
  18. qwen3.6-plus · 61.7 Agentic · 50.0 intel · $0.325/M · 36 t/s · Reasoning, Tools, JSON, +1
  19. minimax-m2.7 · 61.5 Agentic · 49.6 intel · $0.250/M · 334 t/s · Reasoning, Tools, JSON
  20. qwen3.6-27b · 60.9 Agentic · 37.1 intel · $0.288/M · 76 t/s · Reasoning, Tools, JSON, +1
  21. gpt-5.3-codex · 60.5 Agentic · 53.6 intel · $1.75/M · 45 t/s · Reasoning, Tools, JSON, +1
  22. step-3.7-flash · 59.5 Agentic · 42.6 intel · $0.200/M · 76 t/s · Reasoning, Tools, JSON, +1
  23. claude-opus-4.5 · 59.2 Agentic · 43.1 intel · $5.00/M · 51 t/s · Reasoning, Tools, JSON, +1
  24. kimi-k2.6 · 58.7 Agentic · 42.9 intel · $0.680/M · 116 t/s · Reasoning, Tools, JSON, +1
  25. claude-sonnet-4.6 · 57.5 Agentic · 42.6 intel · $3.00/M · 48 t/s · Reasoning, Tools, JSON, +1

Frequently asked

What is the best LLM for agents?

Claude Fable 5 is the best LLM for agents, scoring 80.6 on the Artificial Analysis Agentic Index for tool use and multi-step task completion. Claude Opus 4.8 (77.8) and Claude Opus 4.7 (71.3) round out the top three.

What's a good alternative to Claude Fable 5?

Claude Opus 4.8 (77.8) is the closest alternative on this metric, followed by Claude Opus 4.7 (71.3). See the full ranking above for the tradeoffs.
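
One way to make the tradeoffs concrete is to weigh each model's Agentic score against its listed input price. The sketch below is a rough illustration, not something the index itself publishes: "points per dollar" is a made-up convenience metric, the helper names `points_per_dollar` and `cheapest_at_least` are hypothetical, and the data is just the top five rows of the ranking. It only considers the listed input price per million tokens and ignores output pricing, speed and context length.

```python
# (model, Agentic Index score, input price in USD per million tokens),
# copied from the top five rows of the ranking.
ROWS = [
    ("claude-fable-5", 80.6, 10.00),
    ("claude-opus-4.8", 77.8, 5.00),
    ("claude-opus-4.7", 71.3, 5.00),
    ("gpt-5.5", 69.4, 5.00),
    ("minimax-m3", 68.6, 0.300),
]


def points_per_dollar(rows):
    """Return (model, score / price) pairs, best value first.

    Hypothetical metric: Agentic score divided by input price.
    """
    ranked = [(name, score / price) for name, score, price in rows]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def cheapest_at_least(rows, min_score):
    """Return the cheapest model scoring >= min_score, or None if none qualify.

    Ties on price go to whichever model appears first in rows.
    """
    eligible = [row for row in rows if row[1] >= min_score]
    if not eligible:
        return None
    return min(eligible, key=lambda row: row[2])[0]


if __name__ == "__main__":
    for name, value in points_per_dollar(ROWS):
        print(f"{name}: {value:.1f} Agentic points per input dollar")
    print("Cheapest model scoring 70+:", cheapest_at_least(ROWS, 70.0))
```

On these five rows, the value ordering is the reverse of the headline ranking at both ends: minimax-m3 comes first per input dollar and claude-fable-5 last, which is the kind of tradeoff a score-only ranking hides.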
