modelgrep

Free LLM API

Quick answer · Updated June 2026

modelgrep's API is a free, no-key JSON endpoint for querying 300+ large language models by benchmark score, output speed, latency, price and capability. Call GET https://modelgrep.com/api/v1/models — no signup, CORS-enabled, and the same data that powers the site's rankings.

One JSON endpoint for the whole LLM landscape — benchmark scores, live throughput and latency, token pricing and capabilities for 300+ models on OpenRouter. No API key, no signup, CORS-enabled. It serves the exact data behind modelgrep's rankings and comparisons, so you can build your own model picker, leaderboard or cost report on top of it.

Quickstart

No setup. Hit the base URL https://modelgrep.com/api/v1 and parse JSON.

curl
curl "https://modelgrep.com/api/v1/models?sort=intelligence&limit=5"
JavaScript
const res = await fetch("https://modelgrep.com/api/v1/models?sort=coding&limit=5");
const { data } = await res.json();
console.log(data.map((m) => m.id));
Python
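import requests
# List free models, cheapest input price first.
r = requests.get(
    "https://modelgrep.com/api/v1/models",
    params={"sort": "price_input", "free": 1},
    timeout=10,
)
r.raise_for_status()  # fail loudly on a non-2xx response
for m in r.json()["data"]:
    print(m["id"], m["pricing"]["input"])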

List models

GET /api/v1/models
Filter, sort and paginate the full catalog. All parameters are optional.
q             Search id + name (space-separated terms, OR-matched).
maker         Maker slug: anthropic, openai, google, meta-llama, …
provider      Serving-provider substring.
free          1 → only $0 input-price models.
benchmarked   1 → only models that have benchmark data.
capabilities  Comma list: tools, reasoning, vision, structured, audio_in, image_out.
max_price     Max input price, USD per million tokens.
min_context   Minimum context window, in tokens.
sort          intelligence · coding · agentic · design · throughput · latency · price_input · price_output · context · created
order         asc | desc (sensible default per field).
limit         1–200 (default 50).
offset        Pagination offset (default 0).
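
The parameters above combine into an ordinary query string. The following is a minimal sketch that uses only the parameter names listed in this section; build_models_url is a hypothetical helper written for illustration, not part of any modelgrep client library.

```python
from urllib.parse import urlencode

BASE_URL = "https://modelgrep.com/api/v1"


def build_models_url(**params):
    """Return a /models URL, skipping parameters left as None.

    Hypothetical helper: the parameter names are the ones documented above.
    """
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Lists (e.g. capabilities) become the comma list the endpoint expects.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        query[key] = value
    if not query:
        return f"{BASE_URL}/models"
    # safe="," keeps the comma readable instead of percent-encoding it.
    return f"{BASE_URL}/models?{urlencode(query, safe=',')}"


url = build_models_url(
    capabilities=["tools", "vision"],
    max_price=1,
    min_context=128000,
    sort="coding",
    limit=10,
)
print(url)
```

Passing None for a parameter simply omits it, which keeps call sites tidy when filters are optional in your own UI.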

Example response

{
  "data": [
    {
      "id": "google/gemini-2.5-flash-lite",
      "name": "Google: Gemini 2.5 Flash Lite",
      "maker": "google",
      "context_length": 1048576,
      "pricing": { "input": 0.1, "output": 0.4, "unit": "usd_per_million_tokens" },
      "performance": { "throughput_tps": 320.5, "latency_ms": 410, "uptime": 99.9 },
      "capabilities": { "vision": true, "tools": true, "reasoning": false, ... },
      "benchmarks": { "artificial_analysis": { "intelligence": 46.2, ... }, "design_arena": null },
      "url": "https://modelgrep.com/models/google/gemini-2.5-flash-lite"
    }
  ],
  "meta": { "total": 23, "count": 5, "limit": 5, "offset": 0, "has_more": true, "next_offset": 5 }
}
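
The meta object in the example response is enough to walk the whole catalog page by page. The sketch below assumes that has_more and next_offset behave as their names suggest. iter_models and fake_fetch are hypothetical names, and fake_fetch stands in for a real HTTP call so the example runs offline.

```python
def iter_models(fetch_page, limit=50):
    """Yield every model by following meta.has_more / meta.next_offset."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page["data"]
        meta = page["meta"]
        if not meta.get("has_more"):
            break
        offset = meta["next_offset"]


def fake_fetch(limit, offset):
    """Offline stand-in that mimics the documented response shape."""
    ids = ["a/one", "b/two", "c/three"]
    chunk = ids[offset:offset + limit]
    end = offset + len(chunk)
    has_more = end < len(ids)
    return {
        "data": [{"id": model_id} for model_id in chunk],
        "meta": {
            "total": len(ids),
            "count": len(chunk),
            "limit": limit,
            "offset": offset,
            "has_more": has_more,
            # What the API returns here on the last page is an assumption.
            "next_offset": end if has_more else None,
        },
    }


all_ids = [m["id"] for m in iter_models(fake_fetch, limit=2)]
print(all_ids)
```

To use it against the live endpoint, replace fake_fetch with a function that performs the GET request with the same limit and offset parameters and returns the parsed JSON.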

Single model

GET /api/v1/models/{id}
One model with full benchmark detail and the per-provider breakdown (pricing, quant, context and uptime by provider). Model ids contain a slash — pass them as-is.
curl "https://modelgrep.com/api/v1/models/anthropic/claude-sonnet-4.5"
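
Because the id is passed as-is, the slash must stay literal when the URL is built in code. This is a small sketch with a hypothetical model_url helper. Percent-encoding any other unusual characters is an assumption about what the server accepts, not documented behavior.

```python
from urllib.parse import quote

BASE_URL = "https://modelgrep.com/api/v1"


def model_url(model_id):
    """Build the single-model URL, leaving the slash in the id unescaped."""
    # safe="/" keeps "maker/model" intact; other reserved characters are
    # percent-encoded (an assumption, see the note above).
    return f"{BASE_URL}/models/{quote(model_id, safe='/')}"


print(model_url("anthropic/claude-sonnet-4.5"))
```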

Rankings — best LLM for X

GET /api/v1/rankings/{collection}
The same ranked, answer-first lists that power the /best pages — each response includes a one-sentence answer. Optionally scope to a maker: /api/v1/rankings/{collection}/{maker}.
# Best Anthropic model for coding
curl "https://modelgrep.com/api/v1/rankings/coding/anthropic"

Collections: small · smartest · coding · design · fastest · lowest-latency · cheapest · free · reasoning · vision · agents · open-source · long-context
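
Since the set of collections is fixed, a client can reject a typo before making a request. The sketch below hard-codes the collection names listed above; rankings_url is a hypothetical helper, and the list would need updating if the API adds collections.

```python
BASE_URL = "https://modelgrep.com/api/v1"

# Collection names copied from the list above.
COLLECTIONS = {
    "small", "smartest", "coding", "design", "fastest", "lowest-latency",
    "cheapest", "free", "reasoning", "vision", "agents", "open-source",
    "long-context",
}


def rankings_url(collection, maker=None):
    """Build a rankings URL, optionally scoped to one maker slug."""
    if collection not in COLLECTIONS:
        raise ValueError(f"unknown collection: {collection!r}")
    url = f"{BASE_URL}/rankings/{collection}"
    return f"{url}/{maker}" if maker else url


print(rankings_url("coding", "anthropic"))
print(rankings_url("long-context"))
```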

Makers

GET /api/v1/makers
Every model maker with model counts and its smartest, cheapest and fastest model — handy for building a maker filter against /models.

Frequently asked questions

Is there a free LLM API?

Yes. The modelgrep API is free and requires no API key or signup. It exposes benchmarks, live speed and latency, pricing and capabilities for 300+ models. Responses are cached for about an hour; please keep request volume reasonable.

How do I get LLM benchmark data through an API?

Call GET https://modelgrep.com/api/v1/models. Each model includes its Artificial Analysis Intelligence, Coding and Agentic index scores plus Design Arena Elo under the benchmarks field. Sort by any of them with ?sort=intelligence (or coding, agentic, design).

Does the modelgrep API require an API key or authentication?

No. There is no key, token or signup. CORS is open (Access-Control-Allow-Origin: *), so you can call it directly from a browser, a serverless function or a script.

How do I find the cheapest or fastest LLM programmatically?

Use the sort parameter — ?sort=price_input for the cheapest input pricing, or ?sort=throughput for the fastest output. You can also hit the ranked endpoints GET /api/v1/rankings/cheapest and /api/v1/rankings/fastest, which return a pre-ranked list and a one-sentence answer.

Can I compare AI model prices and context windows via the API?

Yes. Every model returns a pricing object (input, output and cache-read in USD per million tokens) and context_length in tokens, alongside capabilities and per-provider pricing on the single-model endpoint.
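
Because prices are quoted per million tokens, turning a pricing object into a cost estimate is a single division. The sketch below uses the input and output fields from the example response. estimate_cost_usd is a hypothetical helper, and it leaves out cache-read pricing because that field's name is not shown here.

```python
def estimate_cost_usd(pricing, input_tokens, output_tokens):
    """Estimate the cost of a workload from per-million-token prices."""
    total = pricing["input"] * input_tokens + pricing["output"] * output_tokens
    return total / 1_000_000


# Pricing object copied from the example response for gemini-2.5-flash-lite.
pricing = {"input": 0.1, "output": 0.4, "unit": "usd_per_million_tokens"}

# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost_usd(pricing, input_tokens=2_000_000, output_tokens=500_000)
print(f"${cost:.2f}")
```

Running the same function over every model in a /models response gives a quick cost ranking for a specific input-to-output ratio, which a sort on input price alone cannot provide.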

What are the rate limits?

There is no hard rate limit, but it is a free service on fair-use terms. Responses are cached roughly an hour at the edge, so repeated identical requests are cheap. Cache aggressively on your side rather than polling.
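
One way to cache on your side is a small in-memory cache keyed by URL. This is a minimal sketch. The one-hour TTL mirrors the edge cache duration mentioned above, TTLCache is a hypothetical class, and fake_fetch stands in for the real HTTP call so the example runs offline.

```python
import time


class TTLCache:
    """Cache fetched payloads per URL for a fixed number of seconds."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (expires_at, payload)

    def get(self, url):
        now = self._clock()
        hit = self._entries.get(url)
        if hit and hit[0] > now:
            return hit[1]  # still fresh: no network call
        payload = self._fetch(url)
        self._entries[url] = (now + self._ttl, payload)
        return payload


calls = []


def fake_fetch(url):
    """Offline stand-in that records each simulated request."""
    calls.append(url)
    return {"data": [], "meta": {"total": 0}}


now = [0.0]  # controllable clock for the demo
cache = TTLCache(fake_fetch, ttl_seconds=3600, clock=lambda: now[0])
url = "https://modelgrep.com/api/v1/models?sort=intelligence&limit=5"

cache.get(url)   # first call fetches
cache.get(url)   # second call is served from memory
now[0] = 3601.0  # move past the TTL
cache.get(url)   # entry expired, so this fetches again
print(len(calls))
```

In a real client, fetch would perform the GET request and return the parsed JSON. The injected clock exists only to make expiry testable.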

Is this the official OpenRouter API?

No. modelgrep is an independent project, not affiliated with OpenRouter. It aggregates and enriches public data from OpenRouter, Artificial Analysis and Design Arena into one consistent, read-only JSON shape.
