modelgrep

Best Xiaomi Vision Models

Quick answer · Updated June 2026

MiMo-V2.5 is the best vision-capable Xiaomi model, pairing 49.0 intelligence with image and document understanding.

Intelligence: 49.0
Speed: 44 t/s
Input: $0.140/M
Context: 1.0M

This ranking covers multimodal large language models that accept image input, ordered by intelligence. These are the best vision-capable AI models for understanding images, documents and charts.

  1. mimo-v2.5
    Reasoning · Tools · JSON · +2
    49.0 intel · $0.140/M · 44 t/s

Frequently asked

What is the best Xiaomi model for vision?

MiMo-V2.5 is the best vision-capable Xiaomi model, pairing 49.0 intelligence with image and document understanding.

How many Xiaomi models are there?

modelgrep tracks 3 Xiaomi models with live benchmarks, speed, latency and per-provider pricing, led on intelligence by MiMo-V2.5-Pro. Only 1 of them qualifies for this ranking.
