modelgrep

Best Text-to-Video Models

Quick answer · Updated July 2026

The best text-to-video model is Seedance 2.0 Fast Text to Video, rated 1271 Elo (#3 of 85) in the Video Arena by human preference. Seedance 2.0 Text to Video API (also 1271 Elo) and Grok Imagine Video (1231 Elo) round out the top three. Eight of the 16 models below carry arena rankings.

Elo: 1271
Arena rank: #3 of 85
Tracked: 16

AI video generation models ranked by human-preference Elo. Compare Veo, Sora-class, Seedance, Kling and every other text-to-video model in one place.

  1. Seedance 2.0 Fast Text to Video
     bytedance/seedance-2.0/fast/text-to-video · commercial
     1271 Elo
  2. Seedance 2.0 Text to Video API
     bytedance/seedance-2.0/text-to-video · commercial
     1271 Elo
  3. Grok Imagine Video
     xai/grok-imagine-video/text-to-video · commercial
     1231 Elo
  4. Veo 3 Fast
     fal-ai/veo3/fast · commercial
     1218 Elo
  5. Veo3.1 Lite Text to Video
     fal-ai/veo3.1/lite · commercial
     1213 Elo
  6. Veo 3.1
     fal-ai/veo3.1 · commercial
     1209 Elo
  7. Veo 3.1 Fast
     fal-ai/veo3.1/fast · commercial
     1209 Elo
  8. Sora 2
     fal-ai/sora-2/text-to-video · commercial
     1175 Elo
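
To give a feel for what the gaps between these ratings mean, the sketch below converts a rating difference into an expected head-to-head preference rate. It assumes the classic Elo logistic curve with a 400-point scale; the page does not say how the Video Arena actually computes its ratings, so treat the function `expected_score` and its `scale` parameter as illustrative assumptions, not the arena's method.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of votes for A over B under the classic Elo logistic curve.

    The 400-point scale is the conventional Elo default and is an assumption
    here; the arena's real rating formula is not stated on this page.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings copied from the list above.
ratings = {
    "Seedance 2.0 Fast Text to Video": 1271,
    "Grok Imagine Video": 1231,
    "Sora 2": 1175,
}

top = ratings["Seedance 2.0 Fast Text to Video"]
for name, rating in ratings.items():
    # Expected share of pairwise votes the top-rated model would win against `name`.
    print(f"vs {name}: {expected_score(top, rating):.3f}")
```

Under that assumption, the 40-point gap between the first and third entries corresponds to a preference rate of roughly 56%, so adjacent ranks in this list are closer than the ordering alone suggests.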

All text to video endpoints (8 more)
