modelgrep

Best Text-to-Speech Models

Quick answer · Updated July 2026

The best text-to-speech model is Gemini 3.1 Flash Tts, rated 1215 Elo (#1 of 88) by human preference in the Speech Arena. Only 1 of the 4 models tracked below carries an arena ranking.

Elo: 1215
Arena rank: #1 of 88
Tracked: 4

AI voice and text-to-speech models ranked by human-preference Elo from the Artificial Analysis Speech Arena.
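
To give a sense of what an Elo gap means in practice, here is a minimal sketch that assumes the standard Elo logistic formula on a 400-point scale. The page does not say which exact rating method the arena uses, so treat this as an illustration only. The function name `expected_score` and the 1115 opponent rating are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head preferences won by model A over model B.

    Assumes the classic Elo logistic curve with a 400-point scale;
    the arena's actual rating method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    top_rated = 1215.0      # Elo listed on this page for the #1 model
    hypothetical = 1115.0   # made-up opponent rating, 100 points lower
    share = expected_score(top_rated, hypothetical)
    print(f"Expected preference share: {share:.2f}")
```

Under that assumption, a 100-point lead corresponds to listeners preferring the higher-rated model in roughly 64% of pairwise comparisons, and equal ratings give 50%.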

  1. Gemini 3.1 Flash Tts
     fal-ai/gemini-3.1-flash-tts (commercial)
     1215 Elo

All text-to-speech endpoints (3 more)