The fastest Undi95 model is ReMM SLERP 13B at 23 output tokens per second.
AI models ranked by output speed (tokens per second, p50). The fastest large language models for low-latency and high-throughput applications.
modelgrep tracks 1 Undi95 model with live benchmarks, speed, latency, and per-provider pricing. It qualifies for this ranking.