Function calling (tool use)

A model capability for emitting structured calls to external tools or APIs: the foundation of agents, retrieval, and anything that acts on the world.

Function calling (a.k.a. tool use) lets a model output a structured request, consisting of a function name plus JSON arguments, instead of plain text. The model does not execute anything itself: your code runs the function and returns the result, and the model continues from there. It's how an LLM checks the weather, queries a database, or runs a calculation it can't do reliably itself.
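The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not any provider's actual API: the `get_weather` function, the `TOOLS` registry, and the `{"name": ..., "arguments": ...}` shape of the model output are all assumptions made for the example, and the model's output is hard-coded rather than produced by a real LLM.

```python
import json

def get_weather(city: str) -> dict:
    # Hypothetical tool: a stand-in for a real weather API, returns canned data.
    return {"city": city, "temp_c": 18}

# Registry mapping tool names (as the model would emit them) to Python callables.
TOOLS = {"get_weather": get_weather}

def run_tool_call(raw_call: str) -> str:
    """Parse a structured call, run the matching function, return the result as JSON."""
    call = json.loads(raw_call)                # function name plus JSON arguments
    func = TOOLS[call["name"]]                 # look up the function the model asked for
    result = func(**call["arguments"])         # your code, not the model, runs it
    return json.dumps({"name": call["name"], "result": result})

# Assumed shape of a model's tool call; real providers each define their own format.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
tool_result = run_tool_call(model_output)
print(tool_result)
```

In a real application, `tool_result` would be sent back to the model as the next message so it can continue its answer using the returned data.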

Reliable tool use is the single hardest requirement for agents, which chain dozens of calls and must pick the right tool with the right arguments every time. The Artificial Analysis Agentic Index and Tau²-Bench exist specifically to measure this.
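To make "the right tool with the right arguments" concrete, here is a hedged sketch of the kind of check an application might run on each call before executing it. The `TOOL_SPECS` table, the tool names, and the call format are hypothetical; production systems typically validate against a JSON Schema declared with the tool rather than a hand-rolled table like this one.

```python
import json

# Hypothetical tool specifications: required argument names and accepted types.
TOOL_SPECS = {
    "get_weather": {"city": str},
    "add": {"a": (int, float), "b": (int, float)},
}

def validate_call(raw_call: str) -> list[str]:
    """Return a list of problems; an empty list means the call looks usable."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(call, dict):
        return ["call is not a JSON object"]
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOL_SPECS:
        return [f"unknown tool: {name!r}"]
    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["arguments is not a JSON object"]
    problems = []
    spec = TOOL_SPECS[name]
    for key, expected in spec.items():
        if key not in args:
            problems.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            problems.append(f"wrong type for argument: {key}")
    for key in args:
        if key not in spec:
            problems.append(f"unexpected argument: {key}")
    return problems

good = validate_call('{"name": "get_weather", "arguments": {"city": "Paris"}}')
bad = validate_call('{"name": "get_weather", "arguments": {"town": "Paris"}}')
print(good, bad)
```

An agent that chains dozens of calls has to pass a check like this at every step, which is why small per-call error rates compound into frequent task failures.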

Not every model supports it, and support quality varies widely — a model can be brilliant at prose yet unreliable at emitting valid tool calls. Filter for the tool-calling capability when building anything agentic.
