modelgrep

Fastest Perplexity Models

Quick answer · Updated June 2026

The fastest Perplexity model is Sonar at 94 output tokens per second. Sonar Pro (88 t/s) and Sonar Pro Search (33 t/s) round out the top three.

Speed: 94 t/s
Input price: $1.00/M
Context: 127K

Perplexity models ranked by output speed (tokens per second, p50), for low-latency and high-throughput applications.

  1. sonar · 94 t/s · $1.00/M · 1.8s TTFT · 127K ctx · Vision
  2. sonar-pro · 88 t/s · $3.00/M · 1.8s TTFT · 200K ctx · Vision
  3. sonar-pro-search · 33 t/s · $3.00/M · 3.8s TTFT · 200K ctx · Reasoning, JSON, Vision
  4. sonar-deep-research · 30 t/s · $2.00/M · 38.9s TTFT · 128K ctx · Reasoning
  5. sonar-reasoning-pro · 27 t/s · $2.00/M · 21.0s TTFT · 128K ctx · Reasoning, Vision
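
Output speed alone does not determine how long a full response takes, because time to first token (TTFT) is added on top. The sketch below is a rough estimate, not modelgrep's methodology: it assumes tokens arrive at a constant p50 rate once generation starts, and the 500-token response length is a hypothetical value chosen for illustration. The names `Model`, `estimated_seconds`, and `rank_by_total_time` are made up for this example; the speed and TTFT figures are copied from the ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tokens_per_s: float  # p50 output speed, from the ranking
    ttft_s: float        # time to first token in seconds, from the ranking


MODELS = [
    Model("sonar", 94, 1.8),
    Model("sonar-pro", 88, 1.8),
    Model("sonar-pro-search", 33, 3.8),
    Model("sonar-deep-research", 30, 38.9),
    Model("sonar-reasoning-pro", 27, 21.0),
]


def estimated_seconds(model: Model, output_tokens: int) -> float:
    """Assumed model: TTFT plus output tokens divided by a constant speed."""
    return model.ttft_s + output_tokens / model.tokens_per_s


def rank_by_total_time(models: list[Model], output_tokens: int) -> list[Model]:
    """Order models by estimated end-to-end time, fastest first."""
    return sorted(models, key=lambda m: estimated_seconds(m, output_tokens))


if __name__ == "__main__":
    tokens = 500  # hypothetical response length
    for m in rank_by_total_time(MODELS, tokens):
        print(f"{m.name}: {estimated_seconds(m, tokens):.1f}s")
```

Under these assumptions, the top three keep their order for a 500-token response, but sonar-reasoning-pro (about 39.5s) finishes ahead of sonar-deep-research (about 55.6s) because of its shorter TTFT, even though it streams more slowly.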

Frequently asked

What is the fastest Perplexity model?

The fastest Perplexity model is Sonar at 94 output tokens per second. Sonar Pro (88 t/s) and Sonar Pro Search (33 t/s) round out the top three.

What's a good alternative to Sonar?

Sonar Pro (88 t/s) is the closest alternative on this metric, followed by Sonar Pro Search (33 t/s). See the full ranking above for the tradeoffs.

How many Perplexity models are there?

modelgrep tracks 5 Perplexity models with live benchmarks, speed, latency, and per-provider pricing. All 5 qualify for this ranking.
