Meta
Llama 4 Scout
Open Source
7.5 out of 10
Meta's open-source flagship has a 10 million token context window — by a wide margin the largest of any model available. The weights are free to download under Meta's Llama 4 license, but running it costs compute. Via Groq it's among the cheapest options at ~$0.11/1M tokens.
Context window
10.0M tokens
API (blended)
$0.11/1M
Consumer access
API only
Multimodal
Yes
Strengths
- 10M token context window; nothing else comes close
- Open weights: free to download, no per-token license fee to Meta
- Very cheap API access via Groq (~$0.11/1M vs $4.50–$6.00 for frontier closed models)
- Natively multimodal (text + images)
- Extremely fast on Groq (~180 t/s output)
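To give the 10M-token figure a rough sense of scale, here is a back-of-envelope sketch; the words-per-token and words-per-page ratios are common heuristics, not figures from this card:

```python
# Back-of-envelope: how much text fits in a 10M-token context window?
# Assumes ~0.75 words per token and ~500 words per page (rough
# heuristics; actual tokenization varies by content and language).
CONTEXT_TOKENS = 10_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE
print(f"~{words / 1e6:.1f}M words, roughly {pages:,.0f} pages")
# → ~7.5M words, roughly 15,000 pages
```

By this rough measure, the window holds on the order of a small library shelf of text, which is why long-context tasks lead the "Best for" list below.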
Weaknesses
- Not truly free: compute always costs money (GPU or hosted API)
- GPQA Diamond score (57.2%) significantly below frontier closed-source models
- No official Meta chat UI; must use third-party providers
- Self-hosting needs at least one H100, a high barrier for individuals
Best for
open-source deployment, long-context tasks, privacy-sensitive work, cost-conscious developers
Not ideal for
users who need a polished chat interface, cutting-edge reasoning tasks
Pricing details
API pricing
All prices are input/output per 1M tokens.

- Groq ($0.11/$0.11, free tier): Fastest hosted inference (~180 t/s). Free tier is rate-limited by requests/min and daily tokens; not suitable for production workloads.
- Together AI ($0.18/$0.18, free tier): $25 free credits on signup.
- OpenRouter ($0.10/$0.10): Routes to the cheapest available provider; price may vary.
- Self-hosted: Download the weights free from Meta (Llama 4 Community License). Requires at minimum a single H100 (80 GB). You pay your own cloud/hardware costs, typically $2–$8/hr on a cloud GPU.
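At the listed rates, monthly API spend is simple arithmetic. A minimal sketch, using the per-1M-token prices from this page (the usage volumes in the example are hypothetical; verify current rates before budgeting):

```python
# Per-1M-token prices (input, output) in USD, as listed on this card.
PRICES = {
    "groq": (0.11, 0.11),
    "together_ai": (0.18, 0.18),
    "openrouter": (0.10, 0.10),
}

def monthly_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one month's usage at the listed rates."""
    inp_price, out_price = PRICES[provider]
    return (input_tokens / 1e6) * inp_price + (output_tokens / 1e6) * out_price

# Example: 500M input + 50M output tokens per month on Groq.
print(round(monthly_cost("groq", 500_000_000, 50_000_000), 2))
# → 60.5
```

Even at heavy volume the hosted API stays cheap, which is the practical alternative to paying $2–$8/hr for a self-hosted GPU.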
Prices verified February 2026. LLM pricing changes frequently; verify at the provider's site before budgeting.
Last updated: February 2026