GPT-5.4 is the best OpenAI model for coding, with a 57.2 Artificial Analysis Coding Index across benchmarks like SWE-bench and SciCode. GPT-5.5 (56.2) and GPT-5.3-Codex (53.1) round out the top three.
This page ranks AI models by the Artificial Analysis Coding Index, which measures real-world software engineering ability across benchmarks like SWE-bench, SciCode and terminal tasks. Use it to find the best LLMs for code generation, debugging and agentic development.
GPT-5.5 (56.2) is the closest alternative on this metric, followed by GPT-5.3-Codex (53.1). See the full ranking above for the tradeoffs.
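To make the gaps between these three scores concrete, here is a minimal sketch that sorts them and prints each model's distance from the leader. The scores are the ones quoted on this page; the names `coding_index` and `gap_to_leader` are illustrative assumptions and not part of any modelgrep or Artificial Analysis API.

```python
# Coding Index scores as quoted on this page (hypothetical variable name).
coding_index = {
    "GPT-5.5": 56.2,
    "GPT-5.3-Codex": 53.1,
    "GPT-5.4": 57.2,
}

# Sort from highest to lowest score.
ranked = sorted(coding_index.items(), key=lambda item: item[1], reverse=True)
leader_name, leader_score = ranked[0]


def gap_to_leader(score: float) -> float:
    """Return how many index points a score trails the leader, to one decimal."""
    return round(leader_score - score, 1)


for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {score} (gap to leader: {gap_to_leader(score)})")
```

On these numbers the top two models sit one index point apart, while the third trails the leader by roughly four points.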
modelgrep tracks 62 OpenAI models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by GPT-5.4. Of those, 25 qualify for this ranking.