modelgrep

Longest-Context LLMs

Quick answer · Updated June 2026

Llama 4 Scout has the largest context window of any LLM, at 10M tokens. Grok 4.20 Multi-Agent (2M) and Grok 4.20 (2M) round out the top three.

Context: 10M
Intelligence: 13.5
Speed: 112 t/s
Input: $0.100/M

AI models with the largest context windows, ranked by token capacity. Larger windows suit long documents, codebases, and extended conversations.

  1. llama-4-scout
    Tools, JSON, Vision · 13.5 intel · $0.100/M · 112 t/s
    Context: 10M
  2. grok-4.20-multi-agent
    Reasoning, JSON, Vision · $2.00/M · 355 t/s · 21.7s ttft
    Context: 2M
  3. grok-4.20
    Reasoning, Tools, JSON, +1 · 29.7 intel · $1.25/M · 99 t/s
    Context: 2M
  4. gpt-5.5-pro
    Reasoning, Tools, JSON, +1 · $30.00/M · 45 t/s · 49.0s ttft
    Context: 1.1M
  5. gpt-5.5
    Reasoning, Tools, JSON, +1 · 56.7 intel · $5.00/M · 37 t/s
    Context: 1.1M
  6. gpt-5.4-pro
    Reasoning, Tools, JSON, +1 · $30.00/M · 2 t/s · 4.4s ttft
    Context: 1.1M
  7. gpt-5.4
    Reasoning, Tools, JSON, +1 · 35.4 intel · $2.50/M · 59 t/s
    Context: 1.1M
  8. gemini-3.1-pro-preview-customtools
    Reasoning, Tools, JSON, +2 · $2.00/M · 61 t/s · 3.4s ttft
    Context: 1.0M
  9. minimax-m3
    Reasoning, Tools, JSON, +1 · 54.7 intel · $0.300/M · 47 t/s
    Context: 1.0M
  10. gemini-3.5-flash
    Reasoning, Tools, JSON, +2 · 43.3 intel · $1.50/M · 164 t/s
    Context: 1.0M
  11. gemini-3.1-flash-lite
    Reasoning, Tools, JSON, +2 · $0.250/M · 104 t/s · 623ms ttft
    Context: 1.0M
  12. deepseek-v4-pro
    Reasoning, Tools, JSON · 39.3 intel · $0.435/M · 62 t/s
    Context: 1.0M
  13. deepseek-v4-flash
    Reasoning, Tools, JSON · 46.0 intel · $0.098/M · 79 t/s
    Context: 1.0M
  14. mimo-v2.5-pro
    Reasoning, Tools, JSON · 53.8 intel · $0.435/M · 32 t/s
    Context: 1.0M
  15. mimo-v2.5
    Reasoning, Tools, JSON, +2 · 49.0 intel · $0.140/M · 49 t/s
    Context: 1.0M
  16. lyria-3-pro-preview
    JSON, Vision · Free · 15 t/s · 11.1s ttft
    Context: 1.0M
  17. lyria-3-clip-preview
    JSON, Vision · Free · 8 t/s · 3.4s ttft
    Context: 1.0M
  18. gemini-3.1-flash-lite-preview
    Reasoning, Tools, JSON, +2 · 33.5 intel · $0.250/M · 95 t/s
    Context: 1.0M
  19. gemini-3.1-pro-preview
    Reasoning, Tools, JSON, +2 · 41.3 intel · $2.00/M · 100 t/s
    Context: 1.0M
  20. gemini-3-flash-preview
    Reasoning, Tools, JSON, +2 · 46.4 intel · $0.500/M · 66 t/s
    Context: 1.0M
  21. gemini-2.5-flash-lite-preview-09-2025
    Reasoning, Tools, JSON, +2 · 19.4 intel · $0.100/M · 202 t/s
    Context: 1.0M
  22. qwen3-coder:free
    Tools · 24.8 intel · Free · 45 t/s
    Context: 1.0M
  23. qwen3-coder
    Tools, JSON · 24.8 intel · $0.220/M · 45 t/s
    Context: 1.0M
  24. gemini-2.5-flash-lite
    Reasoning, Tools, JSON, +2 · 17.6 intel · $0.100/M · 117 t/s
    Context: 1.0M
  25. gemini-2.5-flash
    Reasoning, Tools, JSON, +2 · $0.300/M · 86 t/s · 676ms ttft
    Context: 1.0M

Frequently asked

Which LLM has the largest context window?

Llama 4 Scout has the largest context window of any LLM, at 10M tokens. Grok 4.20 Multi-Agent (2M) and Grok 4.20 (2M) round out the top three.

What's a good alternative to Llama 4 Scout?

Grok 4.20 Multi-Agent (2M) is the closest alternative on this metric, followed by Grok 4.20 (2M). See the full ranking above for the tradeoffs.
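
One way to weigh those tradeoffs is to filter by the context size a job needs and then sort by input price. The sketch below is illustrative only: the `Model` class and both helper functions are hypothetical names, the data is a hand-copied subset of the ranking, and the rounded figures (for example 1.0M) are assumed to mean exactly 1,000,000 tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int     # context window in tokens (rounded, as listed)
    input_usd_per_m: float  # input price in USD per million tokens


# Hand-copied subset of the ranking; not a live data feed.
MODELS = [
    Model("llama-4-scout", 10_000_000, 0.100),
    Model("grok-4.20", 2_000_000, 1.25),
    Model("gpt-5.5", 1_100_000, 5.00),
    Model("deepseek-v4-flash", 1_000_000, 0.098),
]


def rank_by_context(models):
    """Largest context window first; ties broken by lower input price."""
    return sorted(models, key=lambda m: (-m.context_tokens, m.input_usd_per_m))


def cheapest_that_fits(models, needed_tokens):
    """Models whose window holds needed_tokens, cheapest input price first."""
    fits = [m for m in models if m.context_tokens >= needed_tokens]
    return sorted(fits, key=lambda m: m.input_usd_per_m)
```

For a 1.5M-token job, `cheapest_that_fits(MODELS, 1_500_000)` keeps only the two models with windows of 2M or more. For a 1M-token job, every model in the subset fits, and price alone decides the order.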
