modelgrep

Lowest-Latency Deep Cogito Models

Quick answer · Updated June 2026

Cogito v2.1 671B has the lowest latency of any Deep Cogito model, responding in about 350ms to first token.

Latency: 350ms
Speed: 27 t/s
Input: $1.25/M
Context: 128K

AI models ranked by median (p50) time to first token: the most responsive large language models for real-time and interactive use cases.

  1. cogito-v2.1-671b
    Reasoning · JSON · $1.25/M · 27 t/s · 128K ctx
    Latency: 350ms
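
For readers who want to reproduce a ranking like this from their own measurements, here is a minimal Python sketch of ordering models by median (p50) time to first token. It is an illustration under stated assumptions, not modelgrep's actual method: the function names, the second model, and all sample values are hypothetical, with the first model's samples chosen so that their median matches the 350ms figure listed here.

```python
from statistics import median


def p50_ttft_ms(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    return median(samples_ms)


def rank_by_latency(measurements):
    """Return (model, p50_ms) pairs sorted from lowest to highest p50 latency."""
    ranked = [(model, p50_ttft_ms(samples)) for model, samples in measurements.items()]
    return sorted(ranked, key=lambda pair: pair[1])


# Hypothetical samples: only the 350ms median for cogito-v2.1-671b
# corresponds to a figure on this page; everything else is made up.
measurements = {
    "hypothetical-model-b": [500, 480, 610],
    "cogito-v2.1-671b": [310, 350, 420],
}

ranking = rank_by_latency(measurements)
for position, (model, p50) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {p50}ms")
```

The median is used rather than the mean so that a single slow request does not dominate the reported latency.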

Frequently asked

Which Deep Cogito model has the lowest latency?

Cogito v2.1 671B has the lowest latency of any Deep Cogito model, responding in about 350ms to first token.

How many Deep Cogito models are there?

modelgrep tracks 1 Deep Cogito model with live benchmarks, speed, latency and per-provider pricing, and it qualifies for this ranking.
