Lowest-Latency StepFun Models

Quick answer · Updated June 2026

Step 3.5 Flash has the lowest latency of any StepFun model, responding in about 450ms to first token. Step 3.7 Flash (1.4s) is next.

Latency: 450ms
Intelligence: 37.8
Speed: 47 t/s
Input price: $0.090 /M
Context: 262K

AI models ranked by time-to-first-token (p50). The most responsive large language models for real-time and interactive use cases.
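As a rough illustration of what a p50 time-to-first-token figure means, the sketch below takes the median of a set of latency samples. The function name `p50_ttft_ms` and the sample values are hypothetical, chosen only for illustration; they are not modelgrep's measurement code or real benchmark data.

```python
import statistics


def p50_ttft_ms(samples_ms):
    """Return the median (p50) of time-to-first-token samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    return statistics.median(samples_ms)


# Hypothetical measurements for illustration only, not real benchmark data.
samples = [430.0, 445.0, 450.0, 470.0, 1200.0]
print(p50_ttft_ms(samples))
```

Using the median means a single slow outlier (the 1200 ms sample here) does not shift the reported figure the way it would shift a mean.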

1. step-3.5-flash
   Reasoning · Tools · JSON
   37.8 intel · $0.090/M · 47 t/s
   Latency: 450ms
2. step-3.7-flash
   Reasoning · Tools · JSON · +1
   42.6 intel · $0.200/M · 92 t/s
   Latency: 1.4s

Frequently asked

Which StepFun model has the lowest latency?

Step 3.5 Flash has the lowest latency of any StepFun model, responding in about 450ms to first token. Step 3.7 Flash (1.4s) is next.

What's a good alternative to Step 3.5 Flash?

Step 3.7 Flash (1.4s) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
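One way to reason about the tradeoff is to estimate total response time as time to first token plus output length divided by throughput. The sketch below uses the latency and speed figures listed on this page; it assumes the t/s figure is a steady output rate after the first token, which is a simplification, and the helper names are hypothetical.

```python
def total_time_s(ttft_s, tokens_per_s, output_tokens):
    """Estimated seconds to receive a full response of `output_tokens` tokens."""
    return ttft_s + output_tokens / tokens_per_s


def break_even_tokens(ttft_a, tps_a, ttft_b, tps_b):
    """Output length at which models A and B finish at the same time.

    Solves ttft_a + n / tps_a == ttft_b + n / tps_b for n.
    Assumes the two throughputs differ.
    """
    return (ttft_b - ttft_a) / (1.0 / tps_a - 1.0 / tps_b)


# Figures from this page: (time to first token in seconds, tokens per second).
STEP_35_FLASH = (0.45, 47.0)
STEP_37_FLASH = (1.4, 92.0)

n = break_even_tokens(*STEP_35_FLASH, *STEP_37_FLASH)
print(f"break-even at about {n:.0f} output tokens")
```

Under these assumptions, Step 3.5 Flash starts responding sooner, but for longer outputs Step 3.7 Flash's higher throughput can make the full response arrive earlier.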

How many StepFun models are there?

modelgrep tracks 2 StepFun models with live benchmarks, speed, latency and per-provider pricing. Step 3.7 Flash leads on intelligence, and both models qualify for this ranking.
