
Meta

Llama 4 Scout

Open Source
7.5 out of 10

Meta's open-weights flagship has a 10-million-token context window, by a wide margin the largest of any model available. The weights are free to download under the Llama 4 Community License, but running them costs compute. Via Groq it is among the cheapest hosted options at ~$0.11/1M tokens.
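To put those two headline numbers together, here is a rough sketch of what one maximally long prompt would cost at the blended rate quoted above (this assumes input and output tokens are billed at the same $0.11/1M rate):

```python
# Hypothetical worst-case cost of a single request that fills
# the entire 10M-token context window at the blended Groq rate.
context_tokens = 10_000_000   # full Llama 4 Scout context window
price_per_m = 0.11            # $ per 1M tokens, blended

cost = context_tokens / 1_000_000 * price_per_m
print(f"${cost:.2f}")         # → $1.10 for one full-context call
```

So even a request that uses the whole 10M-token window costs about a dollar, versus tens of dollars at frontier closed-model rates.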

Context window

10.0M tokens

API (blended)

$0.11/1M

Consumer access

API only

Multimodal

Yes

Strengths

  • 10M token context window — nothing else comes close
  • Open weights — download free, no per-token license fee to Meta
  • Very cheap API access via Groq (~$0.11/1M vs $4.50–$6.00 for frontier closed models)
  • Natively multimodal (text + images)
  • Extremely fast on Groq (~180 t/s output)

Weaknesses

  • Not truly free — compute always costs money (GPU or hosted API)
  • GPQA Diamond score (57.2%) significantly below frontier closed models
  • No official Meta chat UI — must use third-party providers
  • Self-hosting needs an H100 — high barrier for individuals

Best for

open-source deployment, long-context tasks, privacy-sensitive work, cost-conscious developers

Not ideal for

users who need a polished chat interface, cutting-edge reasoning tasks

Pricing details

API pricing

Groq (free tier): $0.11/$0.11 per 1M tokens (input/output). Fastest hosted inference (~180 t/s). Free tier available, rate-limited by requests/min and daily tokens; not suitable for production workloads.

Together AI (free tier): $0.18/$0.18 per 1M tokens. $25 free credits on signup.

OpenRouter: $0.10/$0.10 per 1M tokens. Routes to the cheapest available provider; price may vary.

Self-hosted: no per-token fee. Download weights free from Meta (Llama 4 Community License). Requires at minimum a single H100 (80GB); you pay your own cloud/hardware costs, typically $2–$8/hr on a cloud GPU.
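Since self-hosting trades a per-token fee for an hourly GPU bill, a quick break-even sketch can help decide between the two. This uses the figures above; the self-hosted throughput is a made-up placeholder, not a measured number:

```python
# Break-even sketch: hosted API ($/token) vs rented GPU ($/hour).
# Throughput for self-hosting is an ASSUMPTION for illustration only.
API_PRICE_PER_M = 0.11               # $/1M tokens, blended (Groq rate above)
GPU_HOURLY = 4.00                    # $/hr, midpoint of the $2–$8/hr range above
ASSUMED_TOKENS_PER_HOUR = 5_000_000  # hypothetical sustained self-hosted throughput

def api_cost(tokens: int) -> float:
    """Hosted API cost in dollars for a given token count."""
    return tokens / 1_000_000 * API_PRICE_PER_M

def self_hosted_cost(tokens: int) -> float:
    """GPU rental cost in dollars at the assumed throughput."""
    return tokens / ASSUMED_TOKENS_PER_HOUR * GPU_HOURLY

# Tokens/hour a GPU must sustain before renting it beats the API price.
break_even = GPU_HOURLY / (API_PRICE_PER_M / 1_000_000)
print(f"{break_even:,.0f} tokens/hour")  # → 36,363,636 tokens/hour
```

At these rates a $4/hr GPU only pays off above roughly 36M sustained tokens per hour, which is why the hosted API is the cheaper path for most individual workloads.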

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 2026