modelgrep

Lowest-Latency StepFun Models

Quick answer · Updated June 2026

Step 3.5 Flash has the lowest latency of any StepFun model, responding in about 450ms to first token. Step 3.7 Flash (1.4s) is next.

Latency: 450ms
Intelligence: 37.8
Speed: 47 t/s
Input: $0.090 /M
Context: 262K

AI models ranked by median (p50) time-to-first-token: the most responsive large language models for real-time and interactive use cases.

  1. step-3.5-flash
    Reasoning · Tools · JSON
    37.8 intel · $0.090/M · 47 t/s
    Latency: 450ms
  2. step-3.7-flash
    Reasoning · Tools · JSON · +1
    42.6 intel · $0.200/M · 92 t/s
    Latency: 1.4s
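
The p50 ordering used above can be sketched as follows. This is a minimal illustration, not modelgrep's actual pipeline: the model names and per-request samples are made up, and `rank_by_p50` is a hypothetical helper. It shows why a median is a reasonable choice for a latency ranking, since a single slow outlier does not move it.

```python
from statistics import median

# Hypothetical per-request time-to-first-token samples in milliseconds.
# These values are invented for illustration; they are not modelgrep data.
ttft_samples_ms = {
    "model-a": [410, 450, 470, 2100, 440],   # one slow outlier (2100 ms)
    "model-b": [1350, 1400, 1500, 1390, 1420],
}


def rank_by_p50(samples):
    """Return (model, p50_ms) pairs sorted from lowest to highest median."""
    p50 = {name: median(values) for name, values in samples.items()}
    return sorted(p50.items(), key=lambda item: item[1])


ranking = rank_by_p50(ttft_samples_ms)
for position, (name, p50_ms) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {p50_ms:.0f} ms")
```

With these sample values, the 2100 ms outlier leaves the median for `model-a` at 450 ms, whereas a mean would have been pulled up to 774 ms.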

Frequently asked

Which StepFun model has the lowest latency?

Step 3.5 Flash has the lowest latency of any StepFun model, responding in about 450ms to first token. Step 3.7 Flash (1.4s) is next.

What's a good alternative to Step 3.5 Flash?

Step 3.7 Flash (1.4s) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
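
One way to reason about that tradeoff is to estimate total response time as time-to-first-token plus output length divided by speed. The sketch below uses the latency and t/s figures listed in the ranking and assumes, as a simplification, that t/s means output tokens per second and stays constant for the whole response. The helper names are hypothetical, and real responses will vary with provider and load.

```python
# Figures taken from the ranking: time-to-first-token (seconds) and speed (tokens/s).
MODELS = {
    "step-3.5-flash": {"ttft_s": 0.45, "tokens_per_s": 47.0},
    "step-3.7-flash": {"ttft_s": 1.4, "tokens_per_s": 92.0},
}


def estimated_total_s(model, output_tokens):
    """Estimate seconds until the last token, assuming constant throughput."""
    spec = MODELS[model]
    return spec["ttft_s"] + output_tokens / spec["tokens_per_s"]


def break_even_tokens(quick_start, quick_stream):
    """Output length at which both models are estimated to finish together."""
    a, b = MODELS[quick_start], MODELS[quick_stream]
    ttft_gap = b["ttft_s"] - a["ttft_s"]
    per_token_gap = 1 / a["tokens_per_s"] - 1 / b["tokens_per_s"]
    return ttft_gap / per_token_gap


crossover = break_even_tokens("step-3.5-flash", "step-3.7-flash")
print(f"Estimated break-even: about {crossover:.0f} output tokens")
```

Under these assumptions the break-even falls at roughly 91 output tokens: Step 3.5 Flash is estimated to finish shorter replies sooner, while Step 3.7 Flash catches up on longer ones despite its slower start.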

How many StepFun models are there?

modelgrep tracks 2 StepFun models with live benchmarks, speed, latency and per-provider pricing; Step 3.7 Flash leads on intelligence. Both qualify for this ranking.
