Laguna XS.2 (free) has the lowest latency of any Poolside model, with a time to first token of about 457 ms. Laguna M.1 (free) is next at about 2.3 s.
AI models ranked by time-to-first-token (p50). The most responsive large language models for real-time and interactive use cases.
Laguna M.1 (free) (2.3s) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
modelgrep tracks 2 Poolside models with live benchmarks, speed, latency, and per-provider pricing. Both qualify for this ranking.
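
For readers unfamiliar with the metric, p50 time-to-first-token is the median of repeated first-token latency measurements. The sketch below shows one way such a ranking could be computed. The function names and the sample values are hypothetical and are not modelgrep's actual data or method; the samples are chosen only so the medians line up with the figures quoted above.

```python
from statistics import median


def p50_ttft_ms(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return median(samples_ms)


def rank_by_ttft(samples_by_model):
    """Sort model names from lowest to highest p50 time-to-first-token."""
    return sorted(
        samples_by_model,
        key=lambda name: p50_ttft_ms(samples_by_model[name]),
    )


# Hypothetical measurements (assumption, for illustration only).
samples = {
    "Laguna M.1 (free)": [2100, 2300, 2600],
    "Laguna XS.2 (free)": [430, 457, 510],
}

ranking = rank_by_ttft(samples)
```

Using the median rather than the mean keeps a single slow request from distorting a model's position in the ranking.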