modelgrep

Fastest Arcee AI Models

Quick answer · Updated June 2026

The fastest Arcee AI model is Trinity Mini at 116 output tokens per second. Trinity Large Thinking (114 t/s) is next.

Speed: 116 t/s
Input: $0.045/M
Context: 131K

Arcee AI models ranked by median (p50) output speed in tokens per second, for low-latency and high-throughput applications.

  1. trinity-mini
    Reasoning · Tools · JSON · $0.045/M · 418ms ttft · 131K ctx
    Speed: 116 t/s
  2. trinity-large-thinking
    Reasoning · Tools · JSON · 31.9 intel · $0.220/M · 488ms ttft
    Speed: 114 t/s
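
To see what the speed and ttft figures above could mean for a full response, here is a rough sketch in Python. It assumes total time is roughly time to first token plus output tokens divided by output speed, and that throughput stays constant for the whole generation. Both are simplifications, not how modelgrep measures anything. The function names and the 500-token example are hypothetical; the ttft and t/s values are copied from the ranking.

```python
# Figures copied from the ranking above (p50 output speed, time to first token).
MODELS = {
    "trinity-mini": {"ttft_ms": 418, "tokens_per_second": 116},
    "trinity-large-thinking": {"ttft_ms": 488, "tokens_per_second": 114},
}


def estimated_response_seconds(ttft_ms: float, tokens_per_second: float,
                               output_tokens: int) -> float:
    """Rough end-to-end time: wait for the first token, then stream the rest.

    Assumption: throughput is constant, and any reasoning tokens emitted
    before the visible answer are already counted in output_tokens.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


def rank_by_estimated_time(output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, estimated seconds) pairs, fastest first."""
    estimates = [
        (name, estimated_response_seconds(m["ttft_ms"], m["tokens_per_second"],
                                          output_tokens))
        for name, m in MODELS.items()
    ]
    return sorted(estimates, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: a 500-token answer.
    for name, seconds in rank_by_estimated_time(500):
        print(f"{name}: about {seconds:.2f} s")
```

With only 2 t/s between the two models, the estimates stay close at any response length, so price and capability are likely to matter more than speed when choosing between them.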

Frequently asked

What is the fastest Arcee AI model?

The fastest Arcee AI model is Trinity Mini at 116 output tokens per second. Trinity Large Thinking (114 t/s) is next.

What's a good alternative to Trinity Mini?

Trinity Large Thinking (114 t/s) is the closest alternative on this metric. See the full ranking above for the tradeoffs.

How many Arcee AI models are there?

modelgrep tracks 4 Arcee AI models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Trinity Large Thinking. 2 of them qualify for this ranking.
