modelgrep

Lowest-Latency Undi95 Models

Quick answer · Updated June 2026

ReMM SLERP 13B has the lowest latency of any Undi95 model, responding in about 896ms to first token.

Latency: 896ms
Speed: 22 t/s
Input: $0.450/M
Context: 6K

AI models ranked by time-to-first-token (p50). The most responsive large language models for real-time and interactive use cases.
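As a rough illustration of what a p50 ranking involves, the sketch below takes the median time-to-first-token per model and sorts ascending. The function name `rank_by_ttft_p50`, the model names, and the sample values are all hypothetical; the page does not describe how modelgrep collects or aggregates its measurements beyond calling the figure a p50.

```python
from statistics import median

def rank_by_ttft_p50(samples_ms):
    """Return (model, p50) pairs sorted from lowest to highest median TTFT.

    samples_ms maps a model name to a list of time-to-first-token
    measurements in milliseconds. Models with no samples are skipped.
    """
    p50 = {model: median(values) for model, values in samples_ms.items() if values}
    return sorted(p50.items(), key=lambda item: item[1])

# Hypothetical measurements in milliseconds, not modelgrep data.
samples = {
    "model-a": [910, 880, 896, 1200, 890],
    "model-b": [1500, 1400, 1450],
}
ranking = rank_by_ttft_p50(samples)
for position, (model, latency) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {latency}ms")
```

Using the median rather than the mean keeps a single slow response (the 1200ms sample above) from shifting a model's position.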

  1. remm-slerp-l2-13b
    JSON · $0.450/M · 22 t/s · 6K ctx
    Latency: 896ms

Frequently asked

Which Undi95 model has the lowest latency?

ReMM SLERP 13B has the lowest latency of any Undi95 model, responding in about 896ms to first token.

How many Undi95 models are there?

modelgrep tracks 1 Undi95 model with live benchmarks, speed, latency, and per-provider pricing. It qualifies for this ranking.
