modelgrep's API is a free, no-key JSON endpoint for querying 300+ large language models by benchmark score, output speed, latency, price and capability. Call GET https://modelgrep.com/api/v1/models — no signup, CORS-enabled, and the same data that powers the site's rankings.
One JSON endpoint for the whole LLM landscape — benchmark scores, live throughput and latency, token pricing and capabilities for 300+ models on OpenRouter. No API key, no signup, CORS-enabled. It serves the exact data behind modelgrep's rankings and comparisons, so you can build your own model picker, leaderboard or cost report on top of it.
No setup. Hit the base URL https://modelgrep.com/api/v1 and parse JSON.
curl "https://modelgrep.com/api/v1/models?sort=intelligence&limit=5"
const res = await fetch("https://modelgrep.com/api/v1/models?sort=coding&limit=5");
const { data } = await res.json();
console.log(data.map((m) => m.id));

import requests
r = requests.get("https://modelgrep.com/api/v1/models", params={"sort": "price_input", "free": 1})
for m in r.json()["data"]:
    print(m["id"], m["pricing"]["input"])

/api/v1/models
q             Search id + name (space-separated terms, OR-matched).
maker         Maker slug: anthropic, openai, google, meta-llama, …
provider      Serving-provider substring.
free          1 → only $0 input-price models.
benchmarked   1 → only models that have benchmark data.
capabilities  Comma list: tools, reasoning, vision, structured, audio_in, image_out.
max_price     Max input price, USD per million tokens.
min_context   Minimum context window, in tokens.
sort          intelligence · coding · agentic · design · throughput · latency · price_input · price_output · context · created
order         asc | desc (sensible default per field).
limit         1–200 (default 50).
offset        Pagination offset (default 0).

Example response
{
"data": [
{
"id": "google/gemini-2.5-flash-lite",
"name": "Google: Gemini 2.5 Flash Lite",
"maker": "google",
"context_length": 1048576,
"pricing": { "input": 0.1, "output": 0.4, "unit": "usd_per_million_tokens" },
"performance": { "throughput_tps": 320.5, "latency_ms": 410, "uptime": 99.9 },
"capabilities": { "vision": true, "tools": true, "reasoning": false, ... },
"benchmarks": { "artificial_analysis": { "intelligence": 46.2, ... }, "design_arena": null },
"url": "https://modelgrep.com/models/google/gemini-2.5-flash-lite"
}
],
"meta": { "total": 23, "count": 5, "limit": 5, "offset": 0, "has_more": true, "next_offset": 5 }
}

/api/v1/models/{id}
curl "https://modelgrep.com/api/v1/models/anthropic/claude-sonnet-4.5"
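The meta object in the example response (has_more, next_offset) points to offset-based paging on /api/v1/models. The following is a minimal Python sketch of walking every page under that assumption. The names iter_models and fetch_json are illustrative and not part of the API; fetch_json is a callable you supply that takes a URL and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator
from urllib.parse import urlencode

BASE_URL = "https://modelgrep.com/api/v1/models"


def iter_models(
    fetch_json: Callable[[str], Dict[str, Any]],
    limit: int = 50,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield every model matching `filters`, following meta.next_offset.

    `fetch_json` is a caller-supplied (hypothetical) function that takes
    a URL and returns the parsed JSON response as a dict.
    """
    offset = 0
    while True:
        query = urlencode({**filters, "limit": limit, "offset": offset})
        page = fetch_json(f"{BASE_URL}?{query}")
        yield from page["data"]
        meta = page["meta"]
        # Stop as soon as the response reports that no further page exists.
        if not meta.get("has_more"):
            break
        offset = meta["next_offset"]
```

With the requests library, a call might look like iter_models(lambda url: requests.get(url, timeout=10).json(), sort="coding"). Injecting the fetch function keeps the paging logic testable without network access.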
/api/v1/rankings/{collection}
Returns a pre-ranked list and a one-sentence answer. Optionally scope to a maker: /api/v1/rankings/{collection}/{maker}.
# Best Anthropic model for coding
curl "https://modelgrep.com/api/v1/rankings/coding/anthropic"
Collections: small · smartest · coding · design · fastest · lowest-latency · cheapest · free · reasoning · vision · agents · open-source · long-context
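As a small illustration of the rankings paths, the Python sketch below builds a rankings URL and rejects collection slugs that are not in the list above. The helper name rankings_url is hypothetical, and validating on the client side is a design choice of this sketch, not something the API requires.

```python
from typing import Optional

RANKINGS_BASE = "https://modelgrep.com/api/v1/rankings"

# Collection slugs copied from the list in the documentation.
COLLECTIONS = frozenset({
    "small", "smartest", "coding", "design", "fastest", "lowest-latency",
    "cheapest", "free", "reasoning", "vision", "agents", "open-source",
    "long-context",
})


def rankings_url(collection: str, maker: Optional[str] = None) -> str:
    """Build a rankings URL, optionally scoped to a maker slug."""
    if collection not in COLLECTIONS:
        raise ValueError(f"unknown collection: {collection!r}")
    url = f"{RANKINGS_BASE}/{collection}"
    # Append the maker segment only when a maker slug was given.
    return f"{url}/{maker}" if maker else url
```

For example, rankings_url("coding", "anthropic") yields the same URL as the curl call shown earlier.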
/api/v1/makers/models.

Yes. The modelgrep API is free and requires no API key or signup. It exposes benchmarks, live speed and latency, pricing and capabilities for 300+ models. Responses are cached for about an hour; please keep request volume reasonable.
Call GET https://modelgrep.com/api/v1/models. Each model includes its Artificial Analysis Intelligence, Coding and Agentic index scores plus Design Arena Elo under the benchmarks field. Sort by any of them with ?sort=intelligence (or coding, agentic, design).
No. There is no key, token or signup. CORS is open (Access-Control-Allow-Origin: *), so you can call it directly from a browser, a serverless function or a script.
Use the sort parameter — ?sort=price_input for the cheapest input pricing, or ?sort=throughput for the fastest output. You can also hit the ranked endpoints GET /api/v1/rankings/cheapest and /api/v1/rankings/fastest, which return a pre-ranked list and a one-sentence answer.
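To make the sort parameter concrete, here is a hedged Python sketch that assembles a /api/v1/models query string and checks the sort field and limit against the documented values before sending anything. The function name models_url is an assumption of this example; extra keyword arguments are passed through as filters unchanged.

```python
from typing import Any
from urllib.parse import urlencode

MODELS_URL = "https://modelgrep.com/api/v1/models"

# Sort fields as documented for the sort parameter.
SORT_FIELDS = (
    "intelligence", "coding", "agentic", "design", "throughput",
    "latency", "price_input", "price_output", "context", "created",
)


def models_url(sort: str, limit: int = 50, **filters: Any) -> str:
    """Return a /models URL sorted by `sort`, with optional filters."""
    if sort not in SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {sort!r}")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"sort": sort, "limit": limit, **filters}
    # urlencode percent-escapes values, so a comma list becomes tools%2Cvision.
    return f"{MODELS_URL}?{urlencode(params)}"
```

Calling models_url("price_input", limit=5) targets the cheapest input pricing, and models_url("throughput", limit=5) targets the fastest output.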
Yes. Every model returns a pricing object (input, output and cache-read in USD per million tokens) and context_length in tokens, alongside capabilities and per-provider pricing on the single-model endpoint.
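Because prices are quoted in USD per million tokens, turning a pricing object into a job cost is a single multiplication per direction. The Python sketch below shows that arithmetic; estimate_cost_usd is a hypothetical helper, and it uses only the input and output fields shown in the example response (cache-read pricing is ignored here).

```python
from typing import Dict

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per million tokens


def estimate_cost_usd(
    pricing: Dict[str, float], input_tokens: int, output_tokens: int
) -> float:
    """Estimate the USD cost of a workload from a model's pricing object."""
    input_cost = input_tokens / TOKENS_PER_UNIT * pricing["input"]
    output_cost = output_tokens / TOKENS_PER_UNIT * pricing["output"]
    return input_cost + output_cost
```

With the example pricing of 0.1 input and 0.4 output, 2,000,000 input tokens cost $0.20 and 500,000 output tokens cost $0.20, for a total of about $0.40.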
There is no hard rate limit, but it is a free service on fair-use terms. Responses are cached roughly an hour at the edge, so repeated identical requests are cheap. Cache aggressively on your side rather than polling.
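One way to follow the advice to cache on your side is a small in-memory cache keyed by URL, with a lifetime that matches the roughly one-hour edge cache. The Python sketch below is illustrative only: TTLCache is a hypothetical class, the 3600-second default is an assumption derived from "roughly an hour", and fetch is whatever function you already use to download and decode JSON.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache fetched responses per URL for a fixed number of seconds."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, url: str) -> Any:
        now = self._clock()
        entry = self._entries.get(url)
        # Serve the stored value while it is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._fetch(url)
        self._entries[url] = (now, value)
        return value
```

The clock is injectable so that expiry can be exercised in tests without waiting; in normal use the default monotonic clock is sufficient.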
No. modelgrep is an independent project, not affiliated with OpenRouter. It aggregates and enriches public data from OpenRouter, Artificial Analysis and Design Arena into one consistent, read-only JSON shape.
Benchmark fields are null when a model has no score; not every model is benchmarked.
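Since benchmark values can be null, client code should not assume a score is present. The Python sketch below reads the intelligence score defensively and ranks models with unscored entries placed last. The field path follows the example response; treating a null artificial_analysis object the same way as a null design_arena is an assumption, and both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def intelligence_score(model: Dict[str, Any]) -> Optional[float]:
    """Return the Artificial Analysis intelligence score, or None if absent."""
    benchmarks = model.get("benchmarks") or {}
    # Assumption: the nested object may itself be null, like design_arena.
    artificial_analysis = benchmarks.get("artificial_analysis") or {}
    return artificial_analysis.get("intelligence")


def rank_by_intelligence(models: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort scored models best-first; unscored models follow in input order."""
    scored = [m for m in models if intelligence_score(m) is not None]
    unscored = [m for m in models if intelligence_score(m) is None]
    scored.sort(key=intelligence_score, reverse=True)
    return scored + unscored
```

Requesting benchmarked=1 avoids unscored models at the source, so this kind of guard matters mainly for unfiltered queries.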