Meta
Llama 4 Scout
Open Source
7.5 out of 10
Meta's open-source flagship has a 10 million token context window — by a wide margin the largest of any model available. The weights are free to download under Meta's Llama 4 license, but running it costs compute. Via Groq it's among the cheapest options at ~$0.11/1M tokens.
Context window
10.0M tokens
API (blended)
$0.11/1M
Consumer access
API only
Multimodal
Yes
Strengths
- 10M token context window; nothing else comes close
- Open weights: free to download, no per-token license fee to Meta
- Very cheap API access via Groq (~$0.11/1M vs $4.50–$6.00 for frontier closed models)
- Natively multimodal (text + images)
- Extremely fast on Groq (~180 t/s output)
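To give the 10M-token figure a rough sense of scale, here is a back-of-envelope sketch; the words-per-token and words-per-page ratios are common heuristics, not figures from this card:

```python
# Back-of-envelope: how much text fits in a 10M-token context window?
# Assumes ~0.75 words per token and ~500 words per page (rough
# heuristics; actual tokenization varies by content and language).
CONTEXT_TOKENS = 10_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE
print(f"~{words / 1e6:.1f}M words, roughly {pages:,.0f} pages")
# → ~7.5M words, roughly 15,000 pages
```

By this rough measure, the window holds on the order of a small library shelf of text, which is why long-context tasks lead the "Best for" list below.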
Weaknesses
- Not truly free: compute always costs money (GPU or hosted API)
- GPQA Diamond score (57.2%) significantly below frontier closed-source models
- No official Meta chat UI; must use third-party providers
- Self-hosting needs at least one H100, a high barrier for individuals
Best for
open-source deployment, long-context tasks, privacy-sensitive work, cost-conscious developers
Not ideal for
users who need a polished chat interface, cutting-edge reasoning tasks
Pricing details
API pricing
All prices are input/output per 1M tokens.

- Groq ($0.11/$0.11, free tier): Fastest hosted inference (~180 t/s). Free tier is rate-limited by requests/min and daily tokens; not suitable for production workloads.
- Together AI ($0.18/$0.18, free tier): $25 free credits on signup.
- OpenRouter ($0.10/$0.10): Routes to the cheapest available provider; price may vary.
- Self-hosted: Download the weights free from Meta (Llama 4 Community License). Requires at minimum a single H100 (80 GB). You pay your own cloud/hardware costs, typically $2–$8/hr on a cloud GPU.
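At the listed rates, monthly API spend is simple arithmetic. A minimal sketch, using the per-1M-token prices from this page (the usage volumes in the example are hypothetical; verify current rates before budgeting):

```python
# Per-1M-token prices (input, output) in USD, as listed on this card.
PRICES = {
    "groq": (0.11, 0.11),
    "together_ai": (0.18, 0.18),
    "openrouter": (0.10, 0.10),
}

def monthly_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one month's usage at the listed rates."""
    inp_price, out_price = PRICES[provider]
    return (input_tokens / 1e6) * inp_price + (output_tokens / 1e6) * out_price

# Example: 500M input + 50M output tokens per month on Groq.
print(round(monthly_cost("groq", 500_000_000, 50_000_000), 2))
# → 60.5
```

Even at heavy volume the hosted API stays cheap, which is the practical alternative to paying $2–$8/hr for a self-hosted GPU.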
Prices verified February 2026. LLM pricing changes frequently; verify at the provider's site before budgeting.
Last updated: February 2026