
Meta

Llama 4 Maverick

Open Source
4.4
out of 10

Meta's midweight open-source model in the Llama 4 family — larger than Scout, with 402B total parameters (17B active per token via a mixture-of-experts design), a 1M-token context window, and notably fast inference at 124.6 t/s. The Artificial Analysis Intelligence Index scores it at 18, below frontier models, but Maverick is not designed to compete on raw reasoning. It exists for workloads where open weights, massive context, and low API cost matter more than cutting-edge benchmark performance. At $0.44/1M blended via Together AI, it's one of the cheapest options for large-context production API use.

Context window

1.0M tokens

API (blended)

$0.44/1M

Consumer access

API only

Multimodal

Yes

Strengths

  • +1M token context window — the largest among open-weight models in this comparison, in the same class as Gemini's context lengths
  • +Very fast at 124.6 t/s — fastest open-weight model in this comparison
  • +Extremely cheap: $0.44/1M blended — budget-tier pricing with large-context support
  • +Open weights with commercial use allowed (Llama 4 Community License)
  • +Multimodal: text + image input

Weaknesses

  • -AA Intelligence Index 18 — the lowest-scoring model in this comparison
  • -Knowledge cutoff August 2024 — older training data than most models here
  • -Llama 4 Community License has commercial restrictions for apps with >700M monthly users
  • -No official Meta consumer product — requires third-party providers
  • -Full 402B self-hosting is technically demanding
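
With no official Meta consumer product, access goes through a provider's API. Most hosts expose an OpenAI-compatible chat-completions endpoint, so a request body can be assembled as below. This is a minimal sketch: the endpoint URL and model id are illustrative placeholders, not verified names — check your provider's model catalog before use.

```python
import json

# Illustrative values only — substitute your provider's actual
# endpoint and the exact Maverick model id from its catalog.
BASE_URL = "https://api.together.xyz/v1/chat/completions"
MODEL_ID = "meta-llama/Llama-4-Maverick"  # placeholder id

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the JSON body for an OpenAI-compatible chat call."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_request("Summarize this contract in three bullets.")
print(json.dumps(body, indent=2))
```

The same body works across Together AI, OpenRouter, and Groq, which is what makes provider-shopping on price practical.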

Best for

  • high-volume, low-cost API use
  • long-context applications (up to 1M tokens)
  • open-weight commercial deployment
  • applications where speed and cost beat raw reasoning

Not ideal for

  • complex reasoning tasks (use frontier models)
  • users who need a chat UI (no official product)
  • accuracy-critical enterprise applications

Pricing details

API pricing

  • Together AI (free tier: $25 free credits on signup; fast inference at 124+ t/s) — $0.27 input / $0.85 output per 1M tokens
  • OpenRouter (routes to the cheapest available provider; price may vary) — $0.25 / $0.77
  • Groq (free tier available with rate limits; very fast inference) — $0.31 / $0.85
  • Self-hosted: weights are a free download under the Llama 4 Community License (commercial use allowed with restrictions). All 402B parameters must be resident in memory even though only 17B are active per token, so a large multi-GPU cluster is required; the MoE design keeps per-token compute low, and quantized builds shrink the memory footprint.
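The numbers above can be sanity-checked with back-of-envelope math. The blended figure assumes some input:output mix; a 3:1 split is a common convention (an assumption here, not a stated methodology), which lands near the $0.44/1M cited. The self-hosting memory floor follows from all 402B weights needing to be resident:

```python
IN_RATE, OUT_RATE = 0.27, 0.85  # Together AI, $ per 1M tokens

def blended(in_rate: float, out_rate: float, in_share: float = 0.75) -> float:
    """Weighted average price per 1M tokens for a given input share."""
    return in_rate * in_share + out_rate * (1 - in_share)

# Assumed 3:1 input:output mix — close to the quoted $0.44/1M blended.
print(f"blended: ${blended(IN_RATE, OUT_RATE):.3f}/1M")  # ~$0.415/1M

# Minimum weight storage for self-hosting at common precisions.
params_b = 402  # billions of parameters
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{params_b * bits / 8:.0f} GB")
```

Even at 4-bit quantization the weights alone are roughly 200 GB, which is why single-GPU hosting is impractical despite the modest 17B active-parameter compute cost.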

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 2026