Lowest-Latency Deep Cogito Models

Quick answer · Updated June 2026

Cogito v2.1 671B has the lowest latency of any Deep Cogito model, responding in about 350ms to first token.

Latency: 350 ms

Speed: 27 t/s

Input: $1.25/M

Context: 128K

AI models ranked by time-to-first-token (p50). The most responsive large language models for real-time and interactive use cases.
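As a rough illustration of what ranking by p50 time-to-first-token means, the sketch below takes the median of a set of latency samples per model and sorts models from lowest to highest. The function names, model names, and sample values are hypothetical and chosen for illustration only; this is not modelgrep's actual measurement method.

```python
from statistics import median


def p50_ttft_ms(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    return median(samples_ms)


def rank_by_latency(models):
    """Return (name, p50) pairs sorted by p50 latency, lowest first."""
    scored = ((name, p50_ttft_ms(samples)) for name, samples in models.items())
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical samples (ms); these are not real measurements.
samples = {
    "model-a": [340, 350, 900],
    "model-b": [500, 480, 520],
}
ranking = rank_by_latency(samples)
```

The median is used rather than the mean so that a single slow outlier (the 900 ms sample above) does not dominate the figure for a model.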

1. cogito-v2.1-671b · Reasoning · JSON · $1.25/M · 27 t/s · 128K ctx · 350 ms latency

Frequently asked

Which Deep Cogito model has the lowest latency?

Cogito v2.1 671B has the lowest latency of any Deep Cogito model, responding in about 350ms to first token.

How many Deep Cogito models are there?

modelgrep tracks 1 Deep Cogito model with live benchmarks, speed, latency and per-provider pricing. It qualifies for this ranking.
