Sonar has the lowest latency of any Perplexity model, with a time to first token of about 1.6s. Sonar Pro (2.2s) and Sonar Pro Search (4.1s) round out the top three.
AI models ranked by time-to-first-token (p50). The most responsive large language models for real-time and interactive use cases.
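As a rough illustration of what a p50 time-to-first-token ranking involves, the sketch below takes the median of each model's latency samples and sorts ascending. The model names, the sample values, and the `rank_by_p50_ttft` helper are all hypothetical; this is not modelgrep's actual pipeline or data.

```python
from statistics import median

# Hypothetical time-to-first-token samples in seconds (not real measurements).
samples = {
    "model-a": [1.4, 1.6, 1.9, 1.5, 2.8],
    "model-b": [2.0, 2.2, 2.5, 2.1, 3.9],
    "model-c": [3.8, 4.1, 4.4, 4.0, 6.2],
}


def rank_by_p50_ttft(ttft_samples):
    """Return (model, p50) pairs sorted from lowest to highest median TTFT."""
    p50 = {model: median(values) for model, values in ttft_samples.items()}
    return sorted(p50.items(), key=lambda item: item[1])


ranking = rank_by_p50_ttft(samples)
for position, (model, p50) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {p50:.1f}s")
```

Using the median rather than the mean keeps a single slow outlier (such as the 2.8s sample for `model-a`) from shifting a model's position.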
Sonar Pro (2.2s) is the closest alternative to Sonar on this metric, followed by Sonar Pro Search (4.1s). See the full ranking above for the tradeoffs.
modelgrep tracks 5 Perplexity models with live benchmarks, speed, latency, and per-provider pricing. All 5 qualify for this ranking.