Paste your prompt, pick your knobs, see the exact cost across every major model. Accounts for cached-input pricing, batch-mode discounts, and context-window limits.
Cache hit: OpenAI (50% off) & Anthropic (90% off) charge reduced rates for cached input tokens (prefix reuse in multi-turn or RAG prompts). Batch: OpenAI & Anthropic both offer 50% off for non-realtime batch processing with a 24-hour turnaround.
| Model | Context | Input / Output per 1M | Cost / Call | Monthly | vs Baseline |
|---|---|---|---|---|---|
Pricing source: Official provider pages, verified April 2026. Rates are US dollars per 1,000,000 tokens. Cache hit % applies to input tokens only (OpenAI 50% off cached, Anthropic 90% off cache reads). Batch mode applies 50% off both input and output on OpenAI & Anthropic endpoints that support it. Self-hosted Llama rates are rough cloud-GPU inference estimates, not a direct API price. Rows highlighted in red exceed the model's context window and aren't usable for this prompt size. For production billing, always double-check with the provider's official calculator — volume discounts and enterprise rates are not modeled here.
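The discount math described above can be sketched in a few lines. The rates and token counts below are placeholders for illustration, not any provider's actual prices:

```python
def cost_per_call(input_tokens, output_tokens, input_rate, output_rate,
                  cache_hit=0.0, cache_discount=0.5, batch=False):
    """Estimate the USD cost of one API call. Rates are USD per 1M tokens.

    cache_hit: fraction of input tokens served from the prompt cache (0..1).
    cache_discount: fraction off the input rate for cached tokens
                    (0.5 for OpenAI cached input, 0.9 for Anthropic cache reads).
    batch: apply the 50% batch discount to both input and output.
    """
    cached = input_tokens * cache_hit
    fresh = input_tokens - cached
    cost = (fresh * input_rate
            + cached * input_rate * (1 - cache_discount)
            + output_tokens * output_rate) / 1_000_000
    if batch:
        cost *= 0.5
    return cost

# Placeholder rates ($3 in / $15 out per 1M), 80% cache hit, 90% cache discount:
print(cost_per_call(10_000, 1_000, 3.0, 15.0, cache_hit=0.8, cache_discount=0.9))
```

Multiply the result by expected calls per month to get the monthly column; batch and cache discounts compose, which is why the cheapest configuration in the table combines both.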
Estimate the cost of API calls to language models based on token count. Compare pricing across models like GPT-4, Claude, and Gemini to budget your AI application costs before you build.
A token is roughly 3/4 of a word in English. The word 'hamburger' splits into three tokens (ham, bur, ger); spaces and punctuation count too. Most providers price per 1,000 or per 1 million tokens.
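The rule of thumb above can be turned into a quick estimator. This is a heuristic sketch, not a real tokenizer; for exact counts use the provider's own tokenizer (e.g. tiktoken for OpenAI models):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~4 chars / ~3/4 word rules of thumb.

    Averages the character-based and word-based heuristics; real
    tokenizers give exact, model-specific counts.
    """
    by_chars = len(text) / 4             # ~4 characters per token in English
    by_words = len(text.split()) / 0.75  # ~3/4 of a word per token
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Estimate the cost of API calls to language models."))
```

Accuracy degrades for code, non-English text, and heavy punctuation, all of which tokenize more densely than plain English prose.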
Pricing changes frequently. As a rule, smaller models (GPT-4o-mini, Claude Haiku, Gemini Flash) cost 10-50x less than flagship models and handle simpler tasks well.