Lowest-Latency AI21 Models

Quick answer · Updated June 2026

Jamba Large 1.7 has the lowest latency of any AI21 model, responding in about 717ms to first token.

Latency: 717ms
Intelligence: 10.9
Speed: 15 t/s
Input: $2.00/M
Context: 256K

AI21 models ranked by median (p50) time-to-first-token: the most responsive large language models for real-time and interactive use cases.
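As a rough illustration of what a p50 time-to-first-token ranking involves, the sketch below takes the median of a set of latency samples per model and sorts models by that value. It is a minimal example under stated assumptions, not modelgrep's actual method: the function names, the model names `model-a` and `model-b`, and all sample values are hypothetical.

```python
from statistics import median


def p50_ttft_ms(samples_ms):
    """Return the median (p50) time-to-first-token in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    return median(samples_ms)


def rank_by_latency(samples_by_model):
    """Return (model, p50) pairs sorted from lowest to highest p50 latency."""
    p50 = {name: p50_ttft_ms(s) for name, s in samples_by_model.items()}
    return sorted(p50.items(), key=lambda item: item[1])


# Hypothetical measurements for illustration only (not modelgrep data).
example_samples = {
    "model-a": [700, 717, 730, 2100, 705],  # one slow outlier
    "model-b": [900, 880, 910],
}

ranking = rank_by_latency(example_samples)
for position, (name, latency) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {latency}ms")
```

Using the median rather than the mean keeps a single slow request, such as the 2100ms sample above, from distorting the reported figure.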

1. jamba-large-1.7 · Tools · JSON · 10.9 intel · $2.00/M · 15 t/s · 717ms latency

Frequently asked

Which AI21 model has the lowest latency?

Jamba Large 1.7 has the lowest latency of any AI21 model, responding in about 717ms to first token.

How many AI21 models are there?

modelgrep tracks 1 AI21 model, Jamba Large 1.7, with live benchmarks, speed, latency and per-provider pricing. It qualifies for this ranking.
