modelgrep

Lowest-Latency Tencent Models

Quick answer · Updated June 2026

Hunyuan A13B Instruct has the lowest latency of any Tencent model, responding in about 1.1s to first token. Hy3 preview (4.4s) is next.

1.1s latency · 8 t/s speed · $0.140/M input · 131K context

Tencent models ranked by median (p50) time to first token, most responsive first. Lower latency matters most for real-time and interactive use cases.

  1. hunyuan-a13b-instruct
    Reasoning · JSON · $0.140/M · 8 t/s · 131K ctx
    1.1s latency
  2. hy3-preview
    Reasoning · Tools · 41.9 intel · $0.063/M · 45 t/s
    4.4s latency
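
As a rough illustration of how a ranking by p50 time to first token could be computed, the sketch below takes a set of latency samples per model, computes the median, and sorts ascending. The function name `rank_by_ttft` and the sample values are hypothetical: the samples are placeholders chosen for illustration, not measurements from modelgrep, and this is not necessarily how the site computes its figures.

```python
from statistics import median


def rank_by_ttft(samples_by_model):
    """Return (model, p50_seconds) pairs, lowest median time to first token first."""
    p50 = {model: median(samples) for model, samples in samples_by_model.items()}
    return sorted(p50.items(), key=lambda item: item[1])


# Placeholder samples in seconds, invented for illustration only.
samples = {
    "hy3-preview": [4.1, 4.4, 5.0],
    "hunyuan-a13b-instruct": [0.9, 1.1, 1.6],
}

ranking = rank_by_ttft(samples)
for position, (model, seconds) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {seconds:.1f}s")
```

Using the median rather than the mean keeps a single slow request from dominating a model's score.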

Frequently asked

Which Tencent model has the lowest latency?

Hunyuan A13B Instruct has the lowest latency of any Tencent model, responding in about 1.1s to first token. Hy3 preview (4.4s) is next.

What's a good alternative to Hunyuan A13B Instruct?

Hy3 preview (4.4s) is the closest alternative on this metric. See the full ranking above for the tradeoffs.
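
One way to weigh that tradeoff is to estimate the full response time rather than only the first token. The sketch below assumes a simple model, total time = time to first token + output tokens / throughput, and uses the latency and t/s figures listed on this page. The model is an assumption: real throughput varies with provider, load, and prompt length, and the function names are hypothetical.

```python
# Latency (seconds to first token) and speed (tokens per second) as listed on this page.
MODELS = {
    "hunyuan-a13b-instruct": {"ttft_s": 1.1, "tokens_per_s": 8},
    "hy3-preview": {"ttft_s": 4.4, "tokens_per_s": 45},
}


def estimated_total_seconds(model, output_tokens):
    """Assumed model: time to first token plus steady-state generation time."""
    stats = MODELS[model]
    return stats["ttft_s"] + output_tokens / stats["tokens_per_s"]


def break_even_tokens(a, b):
    """Output length at which models a and b are estimated to finish together."""
    sa, sb = MODELS[a], MODELS[b]
    return (sb["ttft_s"] - sa["ttft_s"]) / (1 / sa["tokens_per_s"] - 1 / sb["tokens_per_s"])


for tokens in (10, 200):
    for name in MODELS:
        print(f"{name}, {tokens} tokens: {estimated_total_seconds(name, tokens):.1f}s")

print(f"break-even: {break_even_tokens('hunyuan-a13b-instruct', 'hy3-preview'):.1f} tokens")
```

Under these assumptions, Hunyuan A13B Instruct finishes first only for very short replies (up to roughly 32 output tokens); for longer outputs, Hy3 preview's higher throughput outweighs its slower start.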

How many Tencent models are there?

modelgrep tracks 2 Tencent models with live benchmarks, speed, latency, and per-provider pricing. Hy3 preview leads on intelligence. Both models qualify for this ranking.
