modelgrep

Small & Fast Qwen Models

Quick answer · Updated June 2026

Qwen3 32B is the fastest model in Qwen's small, efficient tier, at 328 tokens/sec and $0.080 per million input tokens. Like the rest of the tier, it gives up a few points of raw intelligence in exchange for speed and cost, which suits high-volume, latency-sensitive work. Qwen3 Next 80B A3B Thinking and Qwen3.6 35B A3B, both at 172 t/s, round out the top three.

Speed: 328 t/s
Input: $0.080 /M
Context: 131K

Compact, efficient models (the small/mini/flash/haiku tier), ranked by output speed in tokens per second. These trade a little raw intelligence for low cost and high throughput, which is the right tradeoff for chat, classification, extraction and other high-volume work.

  1. qwen3-32b
    Reasoning · Tools · JSON · $0.080/M · 321ms ttft · 131K ctx
    Speed: 328 t/s
  2. qwen3-next-80b-a3b-thinking
    Reasoning · Tools · JSON · 26.7 intel · $0.098/M · 252ms ttft
    Speed: 172 t/s
  3. qwen3.6-35b-a3b
    Reasoning · Tools · JSON · +1 · 31.5 intel · $0.150/M · 180ms ttft
    Speed: 172 t/s
  4. qwen3.5-35b-a3b
    Reasoning · Tools · JSON · +1 · 30.7 intel · $0.140/M · 150ms ttft
    Speed: 153 t/s
  5. qwen3-vl-8b-thinking
    Reasoning · Tools · JSON · +1 · 16.7 intel · $0.117/M · 508ms ttft
    Speed: 139 t/s
  6. qwen3-coder-next
    Tools · JSON · 28.3 intel · $0.110/M · 636ms ttft
    Speed: 111 t/s
  7. qwen3.6-flash
    Reasoning · Tools · JSON · +1 · $0.188/M · 872ms ttft · 1M ctx
    Speed: 109 t/s
  8. qwen3-30b-a3b-thinking-2507
    Reasoning · Tools · JSON · 22.4 intel · $0.080/M · 374ms ttft
    Speed: 95 t/s
  9. qwen3-30b-a3b-instruct-2507
    Tools · JSON · 15.0 intel · $0.048/M · 274ms ttft
    Speed: 91 t/s
  10. qwen3-30b-a3b
    Reasoning · Tools · JSON · 15.3 intel · $0.120/M · 279ms ttft
    Speed: 91 t/s
  11. qwen3.5-flash-02-23
    Reasoning · Tools · JSON · +1 · $0.065/M · 642ms ttft · 1M ctx
    Speed: 77 t/s
  12. qwen3-next-80b-a3b-instruct:free
    Tools · JSON · 20.1 intel · Free · 583ms ttft
    Speed: 76 t/s
  13. qwen3-next-80b-a3b-instruct
    Tools · JSON · 20.1 intel · $0.090/M · 583ms ttft
    Speed: 76 t/s
  14. qwen3.6-27b
    Reasoning · Tools · JSON · +1 · 37.1 intel · $0.288/M · 531ms ttft
    Speed: 76 t/s
  15. qwen3.5-9b
    Reasoning · Tools · JSON · +1 · 32.4 intel · $0.100/M · 370ms ttft
    Speed: 75 t/s
  16. qwen-2.5-7b-instruct
    $0.040/M · 405ms ttft · 131K ctx
    Speed: 73 t/s
  17. qwen3-coder-30b-a3b-instruct
    Tools · JSON · 20.0 intel · $0.070/M · 983ms ttft
    Speed: 69 t/s
  18. qwen3-vl-30b-a3b-thinking
    Reasoning · Tools · JSON · +1 · 19.7 intel · $0.130/M · 480ms ttft
    Speed: 69 t/s
  19. qwen3-14b
    Reasoning · Tools · JSON · 16.2 intel · $0.100/M · 349ms ttft
    Speed: 66 t/s
  20. qwen-plus-2025-07-28:thinking
    Reasoning · Tools · JSON · $0.260/M · 504ms ttft · 1M ctx
    Speed: 63 t/s
  21. qwen-plus-2025-07-28
    Tools · JSON · $0.260/M · 504ms ttft · 1M ctx
    Speed: 63 t/s
  22. qwen3-vl-8b-instruct
    Tools · JSON · Vision · 14.3 intel · $0.080/M · 455ms ttft
    Speed: 60 t/s
  23. qwen3.5-27b
    Reasoning · Tools · JSON · +1 · 37.2 intel · $0.195/M · 868ms ttft
    Speed: 54 t/s
  24. qwen3-coder-flash
    Tools · JSON · $0.195/M · 1.4s ttft · 1M ctx
    Speed: 40 t/s
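
Output speed alone does not say how long a single response takes, because time to first token (ttft) is paid before any output streams. A rough sketch of how the two figures combine is below. It assumes the simple model "response time = ttft + output tokens / speed", which ignores queueing, prompt length and any hidden reasoning tokens a thinking model emits. The function names and the 500-token and 20-token output lengths are illustrative; the ttft and t/s values are copied from four entries in the ranking.

```python
# (time to first token in seconds, output speed in tokens/sec), copied from the ranking.
MODELS = {
    "qwen3-32b": (0.321, 328),
    "qwen3-next-80b-a3b-thinking": (0.252, 172),
    "qwen3.6-35b-a3b": (0.180, 172),
    "qwen3-coder-flash": (1.4, 40),
}


def estimated_seconds(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Assumed model: wait for the first token, then stream the rest at a constant rate."""
    return ttft_s + output_tokens / tokens_per_s


def rank_by_response_time(output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, estimated seconds) pairs, fastest first, for one output length."""
    rows = [
        (name, estimated_seconds(ttft, tps, output_tokens))
        for name, (ttft, tps) in MODELS.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for length in (500, 20):  # hypothetical long and short responses
        print(f"--- {length} output tokens ---")
        for name, secs in rank_by_response_time(length):
            print(f"{name:30s} {secs:6.2f} s")
```

Under this assumption the ordering depends on response length: for long outputs the highest t/s wins, while for very short outputs (a classification label, say) a lower ttft can matter more than raw speed.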

Frequently asked

What is the smallest, fastest Qwen model?

Qwen3 32B is the fastest model in this ranking, at 328 tokens/sec and $0.080 per million input tokens. It is not the smallest by name: entries such as qwen-2.5-7b-instruct (73 t/s) and qwen3.5-9b (75 t/s) carry smaller size labels but run slower. Qwen3 Next 80B A3B Thinking and Qwen3.6 35B A3B, both at 172 t/s, round out the top three.

What's a good alternative to Qwen3 32B?

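Qwen3 Next 80B A3B Thinking and Qwen3.6 35B A3B are the closest alternatives on output speed, both at 172 t/s. Qwen3 Next 80B A3B Thinking is cheaper ($0.098/M, 252ms ttft, 26.7 intel), while Qwen3.6 35B A3B responds sooner and scores higher ($0.150/M, 180ms ttft, 31.5 intel). See the full ranking above for the tradeoffs.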
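
To put the price tradeoff in concrete terms, the sketch below converts the listed per-million rates into a daily input bill. It assumes the "/M" figures in the ranking are input-token prices, as the quick answer states for Qwen3 32B; output-token pricing is not shown on this page and is left out. The workload (100,000 requests per day at 1,200 input tokens each) and the function names are hypothetical.

```python
# USD per million input tokens, copied from the ranking (assumed to be input prices).
PRICE_PER_M_INPUT = {
    "qwen3-32b": 0.080,
    "qwen3-next-80b-a3b-thinking": 0.098,
    "qwen3.6-35b-a3b": 0.150,
}


def input_cost_usd(price_per_million: float, input_tokens: int) -> float:
    """Cost of a given number of input tokens at a per-million-token rate."""
    return price_per_million * input_tokens / 1_000_000


def daily_input_costs(requests_per_day: int, tokens_per_request: int) -> dict[str, float]:
    """Daily input-token cost per model for a hypothetical steady workload."""
    total_tokens = requests_per_day * tokens_per_request
    return {
        name: input_cost_usd(price, total_tokens)
        for name, price in PRICE_PER_M_INPUT.items()
    }


if __name__ == "__main__":
    # Hypothetical workload: 100,000 requests/day, 1,200 input tokens each.
    for name, cost in daily_input_costs(100_000, 1_200).items():
        print(f"{name:30s} ${cost:,.2f} per day (input tokens only)")
```

Output tokens are usually billed separately, so a full estimate would need those rates as well.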

How many Qwen models are there?

modelgrep tracks 49 Qwen models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by Qwen3.7 Max. 24 of them qualify for this ranking.
