What does the OpenAI API actually cost in 2026?
The pricing pages publish rates per million tokens. That is not how anyone experiences the bill. Here is what the API actually costs in real-world usage — with the math shown, an honest estimate for the four most common patterns, and a short note on why your invoice is always higher than you thought.
OpenAI API pricing ranges from $0.15 to $60 per million tokens depending on the model and whether the tokens are input or output. GPT-4o costs $2.50 input and $10 output per million tokens. GPT-4o mini is cheapest at $0.15 / $0.60. o1 is most expensive in general use at $15 / $60.
For a typical small chatbot (1,000 input / 500 output tokens, 500 messages/day), GPT-4o costs about $112.50/month. Same workload on GPT-4o mini is about $6.75/month. Output tokens dominate the bill — always.
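The per-month figures above follow from a single formula: per-request cost times requests per day times days. A minimal sketch, using the rates quoted in this article (verify them against the provider's pricing page before relying on them):

```python
# Monthly API cost for a flat daily request pattern.
# Rates are dollars per million tokens, as quoted in this article.

def monthly_cost(input_tokens, output_tokens, requests_per_day,
                 input_rate, output_rate, days=30):
    """Dollars per month for a fixed per-request token profile."""
    per_request = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_day * days

# The small-chatbot example: 1,000 in / 500 out, 500 messages/day.
gpt4o = monthly_cost(1000, 500, 500, input_rate=2.50, output_rate=10.00)
mini = monthly_cost(1000, 500, 500, input_rate=0.15, output_rate=0.60)
print(f"GPT-4o:      ${gpt4o:,.2f}/month")   # $112.50
print(f"GPT-4o mini: ${mini:,.2f}/month")    # $6.75
```

Note that output tokens contribute $5,000 of the $7,500 per-million-message cost on GPT-4o, which is why they dominate the bill.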
OpenAI API pricing in April 2026
Output tokens are priced 3–5x higher than input tokens across every model. Any cost estimate that weights input and output equally is wrong.
Anthropic API pricing in April 2026
Anthropic pricing is included for side-by-side comparison because users who search for OpenAI pricing frequently evaluate Claude as an alternative. Capped monitors both providers from the same extension.
What four real-world patterns actually cost
Per-million-token rates are hard to intuit. Per-month dollars are not. Each example below assumes a 30-day month with a flat daily request pattern.
Notice the last two rows. A reasoning-heavy agent on o1 at 200 runs/day costs roughly 60x as much per month as a high-volume customer-support autoresponder on GPT-4o mini handling 10x the request count. Model choice dwarfs request volume for cost.
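That comparison can be recomputed directly. The per-request token counts below are assumptions chosen for the sketch, not figures from the table; the rates are the ones quoted in this article:

```python
# Illustrative comparison: model choice vs. request volume.
# Token counts per request are assumed for the sketch.

def monthly(in_tok, out_tok, reqs_per_day, in_rate, out_rate, days=30):
    """Rates are dollars per million tokens."""
    return (in_tok * in_rate + out_tok * out_rate) / 1e6 * reqs_per_day * days

# o1 agent: 4,000 input + 4,500 output tokens (reasoning included), 200 runs/day
agent = monthly(4_000, 4_500, 200, in_rate=15.00, out_rate=60.00)
# GPT-4o mini autoresponder: 1,500 input + 500 output tokens, 2,000 requests/day
support = monthly(1_500, 500, 2_000, in_rate=0.15, out_rate=0.60)

print(f"agent:   ${agent:,.2f}/month")     # $1,980.00
print(f"support: ${support:,.2f}/month")   # $31.50
print(f"ratio:   {agent / support:.0f}x")  # ~63x, despite 10x fewer requests
```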
The five costs that surprise people
Output tokens cost 3–5x input tokens
OpenAI charges $2.50 per million input tokens on GPT-4o but $10 per million output. Most informal cost estimates average the two rates. They should not. Generation-heavy workloads — summarization, long-form writing, reasoning agents — are dominated by output cost.
Reasoning models bill for hidden 'thinking' tokens
o1 and o3 models generate internal reasoning tokens before producing a visible answer. You pay for these tokens at the output rate, and they can multiply the actual output cost by 3–10x or more. A 500-token final answer might be backed by 5,000 tokens of reasoning billed at $60/million.
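The gap between what a per-token calculator predicts and what the invoice shows is easy to quantify. A sketch using the article's $60/million output rate and its illustrative 5,000-token reasoning figure:

```python
# Hidden reasoning tokens: naive estimate vs. what is actually billed.
OUTPUT_RATE = 60 / 1_000_000  # dollars per output token (o1)

visible = 500      # tokens in the answer you see
reasoning = 5_000  # hidden reasoning tokens (illustrative)

naive_cost = visible * OUTPUT_RATE                 # what a calculator shows
billed_cost = (visible + reasoning) * OUTPUT_RATE  # what the invoice shows

print(f"naive:  ${naive_cost:.4f}")   # $0.0300
print(f"billed: ${billed_cost:.4f}")  # $0.3300, 11x the naive estimate
```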
Long context adds compounding cost
Retrieval-augmented applications routinely pass 20,000+ tokens of context per request. At GPT-4o's $2.50 input rate, that's $0.05 per request in input alone — $450/month at 300 queries/day. Long-context architectures need to be designed around the token count, not added to an existing app.
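The input-side arithmetic can be checked directly; over a 30-day month it comes to $450 in context tokens alone, before any output cost:

```python
# Long-context input cost: 20,000 context tokens per request at
# GPT-4o's $2.50/M input rate, 300 queries/day over a 30-day month.
CONTEXT_TOKENS = 20_000
INPUT_RATE_PER_TOKEN = 2.50 / 1_000_000

per_request = CONTEXT_TOKENS * INPUT_RATE_PER_TOKEN
monthly_input_cost = per_request * 300 * 30

print(f"${per_request:.2f} per request")        # $0.05
print(f"${monthly_input_cost:,.2f} per month")  # $450.00
```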
Vision, structured output, and tool calls are priced separately
Image inputs have their own per-image cost. Structured outputs (JSON mode with schemas) and function/tool calling can add 10–20% overhead on top of the base cost. None of this is surfaced clearly in a simple per-token calculator.
OpenAI's 'usage limit' is a soft alert, not a hard cap
The 'usage limit' setting in your OpenAI dashboard is a notification threshold, not an enforced spending cap. The provider continues to process requests after you cross it. A genuine hard cap requires either project-scoped API keys with per-project limits, or a third-party tool that actively cuts off traffic when you exceed your budget.
Use the calculator
If you have specific numbers for your application — tokens per request, requests per day, model — plug them into the calculator on this site. It uses the same rates as this article and updates live.
Open the calculator →

The three things that work
1. Use project-scoped API keys. Both OpenAI and Anthropic let you create keys attached to a specific project with its own spending limit. This is the only provider-enforced hard cap.
2. Track spend daily, not monthly. Providers publish Usage APIs that return current-period cost. A daily check is the difference between a $400 bill and a $4,000 one. Capped runs this check hourly from your browser with a read-only admin key.
3. Set a personal cap below provider limits. Provider billing limits only protect you at their own threshold. A personal cap, backed by notifications at 80%, 100%, and 150%, gives you 20–40% of headroom to react before the provider's limit trips.
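The alert ladder in step 3 is a few lines of logic. A minimal sketch, assuming spend is polled periodically; the threshold fractions come from the article, everything else is illustrative:

```python
# Fire an alert for each threshold newly crossed between two spend checks.
THRESHOLDS = (0.80, 1.00, 1.50)  # fractions of the personal cap

def crossed_thresholds(previous_spend, current_spend, cap):
    """Return threshold fractions crossed since the last check."""
    return [t for t in THRESHOLDS
            if previous_spend < cap * t <= current_spend]

# $200 personal cap; spend jumps from $150 to $210 between two checks.
print(crossed_thresholds(150, 210, cap=200))  # [0.8, 1.0]
```

Tracking the previously seen spend is what makes each alert fire exactly once instead of on every poll.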
Frequently asked
What is the cheapest OpenAI model in 2026?
GPT-4o mini is the cheapest OpenAI model in general use, at $0.15 per million input tokens and $0.60 per million output tokens. GPT-3.5 Turbo remains available at $0.50 input and $1.50 output per million tokens, but GPT-4o mini is meaningfully more capable at comparable or lower cost.
How much does GPT-4o cost per 1,000 tokens?
GPT-4o costs roughly $0.0025 per 1,000 input tokens and $0.01 per 1,000 output tokens. A typical chat message with 1,000 tokens in and 500 tokens out costs about $0.0075 — three-quarters of a cent per message.
Is the Claude API cheaper than OpenAI?
It depends on the tier. Claude Haiku 4.5 ($1 / $5 per million tokens) sits between GPT-4o mini and GPT-4o on price. Claude Sonnet 4.6 ($3 / $15) is slightly more expensive than GPT-4o ($2.50 / $10) but comparable in capability for many tasks. Claude Opus 4.7 ($15 / $75) is priced like OpenAI's o1 for frontier tasks.
Why is my OpenAI bill higher than I expected?
Three common reasons: (1) output tokens cost 3–5x more than input tokens and most usage calculators underweight this, (2) reasoning models like o1 generate hidden 'thinking' tokens you pay for, (3) long context tokens, vision inputs, and structured-output formatting each add overhead. Tracking spend in real time — not at month-end — is the only reliable defense.
How do I set a hard spending limit on OpenAI?
OpenAI's dashboard allows a soft 'usage limit' in your billing settings, but it is a notification, not an enforced cap — the provider continues charging after you cross it. For a hard cap, you need either a third-party tool that reads the Usage API and throttles or alerts in real time, or you need to cap at the provider level by using project-scoped API keys with their own limits.
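What a client-side hard cap looks like in practice: refuse to send a request once tracked spend would exceed the budget. This is a hypothetical sketch with made-up class and method names; a real deployment would reconcile the local tally against the provider's Usage API rather than trust estimates alone:

```python
# Minimal client-side hard cap: block calls that would exceed the budget.

class BudgetExceeded(Exception):
    pass

class CappedClient:
    def __init__(self, monthly_cap_usd):
        self.cap = monthly_cap_usd
        self.spent = 0.0

    def request(self, estimated_cost_usd, call):
        """Run `call()` only if it fits within the remaining budget."""
        if self.spent + estimated_cost_usd > self.cap:
            raise BudgetExceeded(
                f"${self.spent:.2f} spent of ${self.cap:.2f} cap")
        self.spent += estimated_cost_usd
        return call()

client = CappedClient(monthly_cap_usd=1.00)
client.request(0.75, lambda: "ok")        # allowed: $0.75 of $1.00
try:
    client.request(0.50, lambda: "ok")    # would exceed the cap
except BudgetExceeded as e:
    print("blocked:", e)
```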
How accurate are OpenAI API cost calculators?
A calculator is an estimate. Real cost varies because token counts for the same English input differ across models (different tokenizers), and because output tokens are unpredictable. Use the calculator to set a realistic cap, then use a tracker to see actual spend against that cap.
Does ChatGPT Plus count against my API spend?
No. ChatGPT Plus ($20/month subscription) is a separate product from the API. API spend is billed per-token based on the rates in this article. Using ChatGPT in a browser does not consume API credits.
Pricing reflects publicly listed per-million-token rates as of April 2026. Providers adjust frequently. This article is updated quarterly — the last review date is visible in the page metadata. Verify against the provider's own pricing pages before building on these numbers.
Worked examples assume 30-day months and flat daily request volume. Real usage is bursty; real bills have spikes. The point of the examples is to illustrate order of magnitude, not to promise a number.