LLMs with the Largest Context Windows

Context window determines how much text a model can see at once — conversation history, documents, code, and your prompt all count against it. A larger window is not always necessary, but when it is, hitting the limit is painful.
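To see how quickly that budget fills, here is a minimal sketch using the common rough heuristic of ~4 characters per token for English text (real tokenizers vary by model, so treat the numbers as estimates; all names and sample strings are illustrative):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Real tokenizers (BPE variants) differ by model; this is a ballpark."""
    return max(1, len(text) // 4)

# Everything in the request counts against the same window:
prompt = "Summarize the attached report in three bullet points."
history = "user: status?\nassistant: on track.\n" * 50     # chat history
document = "Quarterly revenue grew 4% over the prior...\n" * 2000

total = sum(map(estimate_tokens, (prompt, history, document)))
print(f"~{total} tokens of the window already used")
```

Even a modest document repeated across a long session dominates the count, which is why conversation history is usually the first thing to get truncated.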

Largest context window: Llama 4 Scout · Meta · 10M tokens · by a wide margin

Context window tiers

10M+: Entire codebases, book-length documents, massive multi-turn sessions. Nothing practical requires more than this today.
1M–5M: Very long documents, large repos, extended research sessions. More than sufficient for almost any real-world task.
400K–1M: Long documents and large codebases. Handles most enterprise document processing workloads.
128K–400K: Standard frontier range. Handles most tasks: long PDFs, full files, extended conversations.
Under 128K: Sufficient for most chat and coding tasks. Noticeable limits on very large documents.
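The tiers above amount to a simple range lookup. A minimal sketch, assuming lower-bound-inclusive boundaries (the table itself leaves boundary handling ambiguous):

```python
from bisect import bisect_right

# Lower bounds (in tokens) for each tier from the table above.
# Boundary treatment (e.g. exactly 400K) is a simplifying assumption.
TIER_BOUNDS = [0, 128_000, 400_000, 1_000_000, 10_000_000]
TIER_NAMES = ["Under 128K", "128K-400K", "400K-1M", "1M-5M", "10M+"]

def context_tier(window_tokens: int) -> str:
    """Map a context window size to its tier label."""
    return TIER_NAMES[bisect_right(TIER_BOUNDS, window_tokens) - 1]

print(context_tier(200_000))    # 128K-400K
print(context_tier(2_000_000))  # 1M-5M
```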
1. Llama 4 Scout · 10M tokens · quality 7.5 · Intelligence: AA 38.5 (est.) · API: $0.11/1M blended
2. 2M tokens · quality 8.0 · Intelligence: AA 41.4 · API: $6.00/1M blended
3. 1M tokens · quality 8.8 · Intelligence: AA 48.4 · API: $4.50/1M blended
4. 1M tokens · quality 7.3 · Intelligence: AA 35.0 · API: $1.13/1M blended
5. 1M tokens · quality 4.4 · Intelligence: AA 18.0 · API: $0.44/1M blended
6. GPT-5.2 (OpenAI) · 400K tokens · quality 8.3 · Intelligence: AA 46.6 · API: $4.81/1M blended
7. 400K tokens · quality 7.3 · Intelligence: AA 39.0 · API: $0.69/1M blended
8. 256K tokens · quality 4.6 · Intelligence: AA 23.0 · API: $0.75/1M blended
9. 200K tokens · quality 8.0 · Intelligence: AA 44.3 · API: $6.00/1M blended
10. 200K tokens · quality 7.5 · Intelligence: AA 46.0 · API: $10.00/1M blended
11. 128K tokens · quality 6.3 · Intelligence: AA 41.6 · API: $0.48/1M blended

Things to know about context windows

Performance degrades at the edges. Most models are less reliable at retrieving information buried deep in a very long context (the "lost in the middle" effect). A 200K window filled to 190K is not the same as a 200K window filled to 20K.

Long context costs more. You pay per token. A 1M context filled to capacity costs significantly more than a 128K context. Gemini 3 Pro charges 2× for prompts over 200K tokens.
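A minimal sketch of how a tiered surcharge plays out, assuming the higher rate applies to the whole prompt once it crosses the threshold (the function name and example prices are illustrative, not any provider's actual API):

```python
def prompt_cost(tokens: int, base_price_per_m: float,
                threshold: int = 200_000, multiplier: float = 2.0) -> float:
    """Dollar cost of a prompt under tiered pricing.

    Assumption: once `tokens` exceeds `threshold`, the multiplied rate
    applies to the entire prompt. Some providers may instead surcharge
    only the overage; check the actual pricing page.
    """
    rate = base_price_per_m * (multiplier if tokens > threshold else 1.0)
    return tokens / 1_000_000 * rate

# At an illustrative $2.00/1M base rate:
print(f"${prompt_cost(150_000, 2.0):.2f}")  # $0.30
print(f"${prompt_cost(500_000, 2.0):.2f}")  # $2.00
```

Note the jump: the 500K prompt costs more than 6× the 150K one despite being only ~3.3× longer, because it both has more tokens and pays the doubled rate.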

Most tasks don't need more than 128K. Unless you are processing entire books, full codebases, or very long conversation histories, any model on this list will handle your workload. Context window is a tiebreaker — not the primary selection criterion for most users.