Best Budget LLM APIs

Models under $1.00 per million tokens (blended 3:1 input:output ratio), ranked by quality score. At any meaningful volume of API calls, the cost gap to frontier closed-source alternatives adds up quickly: the models below cost $0.11 to $0.75 per million blended tokens, while the premium tier listed further down runs from $1.13 to $10.00.

Blended cost = (input price × 3 + output price) / 4, in USD per 1M tokens. Cheapest reliable provider used. Prices verified February 2026; check current pricing before budgeting.
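As a rough illustration of how a blended figure can be derived from a provider list, the sketch below applies the formula above to the DeepSeek V3.2 prices shown on this page and picks the lowest result. The function names `blended_cost` and `cheapest_provider` are made up for this example, and the code does not attempt to judge which providers count as "reliable"; it only compares prices.

```python
# Per-provider prices in USD per 1M tokens as (input, output), copied from
# the DeepSeek V3.2 entry on this page. Re-check them before relying on this.
PROVIDERS = {
    "DeepSeek": (0.27, 1.10),
    "OpenRouter": (0.28, 1.12),
    "Together AI": (0.30, 1.20),
}


def blended_cost(input_price: float, output_price: float) -> float:
    """Blended USD per 1M tokens at the page's 3:1 input:output ratio."""
    return (input_price * 3 + output_price) / 4


def cheapest_provider(providers: dict[str, tuple[float, float]]) -> tuple[str, float]:
    """Return (provider name, blended cost) for the lowest blended cost."""
    name = min(providers, key=lambda p: blended_cost(*providers[p]))
    return name, blended_cost(*providers[name])


name, cost = cheapest_provider(PROVIDERS)
print(f"{name}: ${cost:.4f} per 1M tokens blended")
# prints "DeepSeek: $0.4775 per 1M tokens blended"
```

Rounded to cents, that is the $0.48 listed for DeepSeek V3.2.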

Llama 4 Scout · Meta · open weights

via Self-hosted: $0/1M input · $0/1M output

All providers (input/output per 1M): Groq: $0.11/$0.11 · Together AI: $0.18/$0.18 · OpenRouter: $0.10/$0.10

Quality: 7.5 · Blended: $0.11/1M


GPT-5 mini · OpenAI

via OpenAI: $0.25/1M input · $2.00/1M output

All providers (input/output per 1M): OpenAI: $0.25/$2.00 · OpenRouter: $0.26/$2.10

Quality: 7.3 · Blended: $0.69/1M


DeepSeek V3.2 · DeepSeek · open weights

via Self-hosted: $0/1M input · $0/1M output

All providers (input/output per 1M): DeepSeek: $0.27/$1.10 · OpenRouter: $0.28/$1.12 · Together AI: $0.30/$1.20

Quality: 6.3 · Blended: $0.48/1M


Mistral Large 3 · Mistral · open weights

via Self-hosted: $0/1M input · $0/1M output

All providers (input/output per 1M): Mistral (la Plateforme): $0.50/$1.50 · OpenRouter: $0.52/$1.55

Quality: 4.6 · Blended: $0.75/1M


Llama 4 Maverick · Meta · open weights

via Self-hosted: $0/1M input · $0/1M output

All providers (input/output per 1M): Together AI: $0.27/$0.85 · OpenRouter: $0.25/$0.77 · Groq: $0.31/$0.85

Quality: 4.4 · Blended: $0.44/1M


Premium tier (above $1.00/1M blended)

Higher quality, higher cost. Shown for comparison.

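Gemini 3 Pro · Google · Blended: $4.50/1M · Quality: 8.8

GPT-5.2 · OpenAI · Blended: $4.81/1M · Quality: 8.3

Claude Sonnet 4.6 · Anthropic · Blended: $6.00/1M · Quality: 8.0

Grok 4.1 · xAI · Blended: $6.00/1M · Quality: 8.0

Claude Opus 4.6 · Anthropic · Blended: $10.00/1M · Quality: 7.5

Gemini 3 Flash · Google · Blended: $1.13/1M · Quality: 7.3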

Why “blended” cost?

LLM APIs charge separately for input and output tokens. A 3:1 input:output ratio is a reasonable approximation for many real workloads, since prompts are often longer than completions. The blended cost formula is (input price × 3 + output price) / 4, which makes models directly comparable on a single number. If your workload is output-heavy (long completions), replace the 3 and the 1 with your actual input and output token shares and divide by their sum for a more accurate estimate.
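For workloads that do not match the 3:1 assumption, the same weighted average can be computed with your own ratio. The sketch below is one way to do that; `blended_cost_for_ratio`, `monthly_cost`, and their parameters are names invented for this example, and the 1:2 ratio and 500M tokens per month are made-up inputs. The prices are the GPT-5 mini figures listed on this page.

```python
def blended_cost_for_ratio(input_price: float, output_price: float,
                           input_parts: float = 3.0,
                           output_parts: float = 1.0) -> float:
    """Weighted average price in USD per 1M tokens.

    input_parts:output_parts is the workload's input:output token ratio;
    the defaults reproduce the page's 3:1 formula.
    """
    total_parts = input_parts + output_parts
    if input_parts < 0 or output_parts < 0 or total_parts == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    return (input_price * input_parts + output_price * output_parts) / total_parts


def monthly_cost(blended_per_million: float, tokens_per_month: float) -> float:
    """Estimated monthly spend in USD for a given total token volume."""
    return blended_per_million * tokens_per_month / 1_000_000


# GPT-5 mini prices from this page: $0.25 input, $2.00 output per 1M tokens.
default_blend = blended_cost_for_ratio(0.25, 2.00)             # 3:1 ratio
output_heavy = blended_cost_for_ratio(0.25, 2.00, 1.0, 2.0)    # 1:2 ratio

print(f"3:1 blend: ${default_blend:.4f}/1M")
print(f"1:2 blend: ${output_heavy:.4f}/1M")
print(f"500M tokens/month at 3:1: ${monthly_cost(default_blend, 500_000_000):.2f}")
```

With these inputs the output-heavy blend comes out at roughly twice the 3:1 figure, which is why a single blended number can understate costs for generation-heavy workloads.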

Prices change constantly. Always verify at the provider's official pricing page before committing to a provider for production workloads.