
Meta

Llama 4 Maverick

Open Source
4.4
out of 10

Meta's midweight open-source model in the Llama 4 family — larger than Scout, with 402B total parameters (17B active per token via a mixture-of-experts design), a 1M-token context window, and notably fast inference at 124.6 t/s. The Artificial Analysis Intelligence Index scores it at 18, below frontier models, but Maverick is not designed to compete on raw reasoning. It exists for workloads where open weights, massive context, and low API cost matter more than cutting-edge benchmark performance. At $0.44/1M blended via Together AI, it's one of the cheapest options for large-context production API use.

Context window

1.0M tokens

API (blended)

$0.44/1M

Consumer access

API only

Multimodal

Yes

Strengths

  • +1M token context window — the largest among open-weight models in this comparison, in the same class as Gemini's context lengths
  • +Very fast at 124.6 t/s — fastest open-weight model in this comparison
  • +Extremely cheap: $0.44/1M blended — budget-tier pricing with large-context support
  • +Open weights with commercial use allowed (Llama 4 Community License)
  • +Multimodal: text + image input

Weaknesses

  • -AA Intelligence Index 18 — the lowest-scoring model in this comparison
  • -Knowledge cutoff August 2024 — older training data than most models here
  • -Llama 4 Community License has commercial restrictions for apps with >700M monthly users
  • -No official Meta consumer product — requires third-party providers
  • -Full 402B self-hosting is technically demanding
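
With no official Meta consumer product, access goes through a provider's API. Most hosts expose an OpenAI-compatible chat-completions endpoint, so a request body can be assembled as below. This is a minimal sketch: the endpoint URL and model id are illustrative placeholders, not verified names — check your provider's model catalog before use.

```python
import json

# Illustrative values only — substitute your provider's actual
# endpoint and the exact Maverick model id from its catalog.
BASE_URL = "https://api.together.xyz/v1/chat/completions"
MODEL_ID = "meta-llama/Llama-4-Maverick"  # placeholder id

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the JSON body for an OpenAI-compatible chat call."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_request("Summarize this contract in three bullets.")
print(json.dumps(body, indent=2))
```

The same body works across Together AI, OpenRouter, and Groq, which is what makes provider-shopping on price practical.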

Best for

  • high-volume, low-cost API use
  • long-context applications (up to 1M tokens)
  • open-weight commercial deployment
  • applications where speed and cost beat raw reasoning

Not ideal for

  • complex reasoning tasks (use frontier models)
  • users who need a chat UI (no official product)
  • accuracy-critical enterprise applications

Pricing details

API pricing

  • Together AI (free tier: $25 free credits on signup; fast inference at 124+ t/s) — $0.27 input / $0.85 output per 1M tokens
  • OpenRouter (routes to the cheapest available provider; price may vary) — $0.25 / $0.77
  • Groq (free tier available with rate limits; very fast inference) — $0.31 / $0.85
  • Self-hosted: weights are a free download under the Llama 4 Community License (commercial use allowed with restrictions). All 402B parameters must be resident in memory even though only 17B are active per token, so a large multi-GPU cluster is required; the MoE design keeps per-token compute low, and quantized builds shrink the memory footprint.
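The numbers above can be sanity-checked with back-of-envelope math. The blended figure assumes some input:output mix; a 3:1 split is a common convention (an assumption here, not a stated methodology), which lands near the $0.44/1M cited. The self-hosting memory floor follows from all 402B weights needing to be resident:

```python
IN_RATE, OUT_RATE = 0.27, 0.85  # Together AI, $ per 1M tokens

def blended(in_rate: float, out_rate: float, in_share: float = 0.75) -> float:
    """Weighted average price per 1M tokens for a given input share."""
    return in_rate * in_share + out_rate * (1 - in_share)

# Assumed 3:1 input:output mix — close to the quoted $0.44/1M blended.
print(f"blended: ${blended(IN_RATE, OUT_RATE):.3f}/1M")  # ~$0.415/1M

# Minimum weight storage for self-hosting at common precisions.
params_b = 402  # billions of parameters
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{params_b * bits / 8:.0f} GB")
```

Even at 4-bit quantization the weights alone are roughly 200 GB, which is why single-GPU hosting is impractical despite the modest 17B active-parameter compute cost.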

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 2026