What does the OpenAI API actually cost in 2026?
The pricing pages publish rates per million tokens. That is not how anyone experiences the bill. Here is what the API actually costs in real-world usage — with the math shown, an honest estimate for the four most common patterns, and a short note on why your invoice is always higher than you thought.
OpenAI API pricing ranges from $0.15 to $60 per million tokens depending on the model and whether the tokens are input or output. GPT-4o costs $2.50 input and $10 output per million tokens. GPT-4o mini is cheapest at $0.15 / $0.60. o1 is most expensive in general use at $15 / $60.
For a typical small chatbot (1,000 input / 500 output tokens, 500 messages/day), GPT-4o costs about $112.50/month. Same workload on GPT-4o mini is about $6.75/month. Output tokens dominate the bill — always.
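The per-month figures above follow from a single formula: per-request cost times requests per day times days. A minimal sketch, using the rates quoted in this article (verify them against the provider's pricing page before relying on them):

```python
# Monthly API cost for a flat daily request pattern.
# Rates are dollars per million tokens, as quoted in this article.

def monthly_cost(input_tokens, output_tokens, requests_per_day,
                 input_rate, output_rate, days=30):
    """Dollars per month for a fixed per-request token profile."""
    per_request = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_day * days

# The small-chatbot example: 1,000 in / 500 out, 500 messages/day.
gpt4o = monthly_cost(1000, 500, 500, input_rate=2.50, output_rate=10.00)
mini = monthly_cost(1000, 500, 500, input_rate=0.15, output_rate=0.60)
print(f"GPT-4o:      ${gpt4o:,.2f}/month")   # $112.50
print(f"GPT-4o mini: ${mini:,.2f}/month")    # $6.75
```

Note that output tokens contribute $5,000 of the $7,500 per-million-message cost on GPT-4o, which is why they dominate the bill.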
OpenAI API pricing in April 2026
Output tokens are priced 3–5x higher than input tokens across every model. Any cost estimate that weights input and output equally is wrong.
Anthropic API pricing in April 2026
Anthropic pricing is included for side-by-side comparison because users who search for OpenAI pricing frequently evaluate Claude as an alternative. Capped monitors both providers from the same extension.
What four real-world patterns actually cost
Per-million-token rates are hard to intuit. Per-month dollars are not. Each example below assumes a 30-day month with a flat daily request pattern.
Notice the last two rows. A reasoning-heavy agent on o1 at 200 runs/day costs roughly 60x as much per month as a high-volume customer-support autoresponder on GPT-4o mini handling 10x the request count. Model choice dwarfs request volume for cost.
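That comparison can be recomputed directly. The per-request token counts below are assumptions chosen for the sketch, not figures from the table; the rates are the ones quoted in this article:

```python
# Illustrative comparison: model choice vs. request volume.
# Token counts per request are assumed for the sketch.

def monthly(in_tok, out_tok, reqs_per_day, in_rate, out_rate, days=30):
    """Rates are dollars per million tokens."""
    return (in_tok * in_rate + out_tok * out_rate) / 1e6 * reqs_per_day * days

# o1 agent: 4,000 input + 4,500 output tokens (reasoning included), 200 runs/day
agent = monthly(4_000, 4_500, 200, in_rate=15.00, out_rate=60.00)
# GPT-4o mini autoresponder: 1,500 input + 500 output tokens, 2,000 requests/day
support = monthly(1_500, 500, 2_000, in_rate=0.15, out_rate=0.60)

print(f"agent:   ${agent:,.2f}/month")     # $1,980.00
print(f"support: ${support:,.2f}/month")   # $31.50
print(f"ratio:   {agent / support:.0f}x")  # ~63x, despite 10x fewer requests
```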
The five costs that surprise people
Output tokens cost 3–5x input tokens
OpenAI charges $2.50 per million input tokens on GPT-4o but $10 per million output. Most informal cost estimates average the two rates. They should not. Generation-heavy workloads — summarization, long-form writing, reasoning agents — are dominated by output cost.
Reasoning models bill for hidden 'thinking' tokens
o1 and o3 models generate internal reasoning tokens before producing a visible answer. You pay for these tokens at the output rate, and they can multiply the actual output cost by 3–10x or more. A 500-token final answer might be backed by 5,000 tokens of reasoning billed at $60/million.
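The gap between what a per-token calculator predicts and what the invoice shows is easy to quantify. A sketch using the article's $60/million output rate and its illustrative 5,000-token reasoning figure:

```python
# Hidden reasoning tokens: naive estimate vs. what is actually billed.
OUTPUT_RATE = 60 / 1_000_000  # dollars per output token (o1)

visible = 500      # tokens in the answer you see
reasoning = 5_000  # hidden reasoning tokens (illustrative)

naive_cost = visible * OUTPUT_RATE                 # what a calculator shows
billed_cost = (visible + reasoning) * OUTPUT_RATE  # what the invoice shows

print(f"naive:  ${naive_cost:.4f}")   # $0.0300
print(f"billed: ${billed_cost:.4f}")  # $0.3300, 11x the naive estimate
```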
Long context adds compounding cost
Retrieval-augmented applications routinely pass 20,000+ tokens of context per request. At GPT-4o's $2.50 input rate, that's $0.05 per request in input alone — $450/month at 300 queries/day. Long-context architectures need to be designed around the token count, not added to an existing app.
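The input-side arithmetic can be checked directly; over a 30-day month it comes to $450 in context tokens alone, before any output cost:

```python
# Long-context input cost: 20,000 context tokens per request at
# GPT-4o's $2.50/M input rate, 300 queries/day over a 30-day month.
CONTEXT_TOKENS = 20_000
INPUT_RATE_PER_TOKEN = 2.50 / 1_000_000

per_request = CONTEXT_TOKENS * INPUT_RATE_PER_TOKEN
monthly_input_cost = per_request * 300 * 30

print(f"${per_request:.2f} per request")        # $0.05
print(f"${monthly_input_cost:,.2f} per month")  # $450.00
```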
Vision, structured output, and tool calls are priced separately
Image inputs have their own per-image cost. Structured outputs (JSON mode with schemas) and function/tool calling can add 10–20% overhead on top of the base cost. None of this is surfaced clearly in a simple per-token calculator.
OpenAI's 'usage limit' is a soft alert, not a hard cap
The 'usage limit' setting in your OpenAI dashboard is a notification threshold, not an enforced spending cap. The provider continues to process requests after you cross it. A genuine hard cap requires either project-scoped API keys with per-project limits, or a third-party tool that actively cuts off traffic when you exceed your budget.
Use the calculator
If you have specific numbers for your application — tokens per request, requests per day, model — plug them into the calculator on this site. It uses the same rates as this article and updates live.
Open the calculator →

The three things that work
1. Use project-scoped API keys. Both OpenAI and Anthropic let you create keys attached to a specific project with its own spending limit. This is the only provider-enforced hard cap.
2. Track spend daily, not monthly. Providers publish Usage APIs that return current-period cost. A daily check is the difference between a $400 bill and a $4,000 one. Capped runs this check hourly from your browser with a read-only admin key.
3. Set a personal cap below provider limits. Provider billing limits only protect you at their own threshold. A personal cap, backed by notifications at 80%, 100%, and 150%, gives you 20–40% of headroom to react before the provider's limit trips.
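The alert ladder in step 3 is a few lines of logic. A minimal sketch, assuming spend is polled periodically; the threshold fractions come from the article, everything else is illustrative:

```python
# Fire an alert for each threshold newly crossed between two spend checks.
THRESHOLDS = (0.80, 1.00, 1.50)  # fractions of the personal cap

def crossed_thresholds(previous_spend, current_spend, cap):
    """Return threshold fractions crossed since the last check."""
    return [t for t in THRESHOLDS
            if previous_spend < cap * t <= current_spend]

# $200 personal cap; spend jumps from $150 to $210 between two checks.
print(crossed_thresholds(150, 210, cap=200))  # [0.8, 1.0]
```

Tracking the previously seen spend is what makes each alert fire exactly once instead of on every poll.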
Frequently asked
What is the cheapest OpenAI model in 2026?
GPT-4o mini is the cheapest OpenAI model in general use, at $0.15 per million input tokens and $0.60 per million output tokens. GPT-3.5 Turbo remains available at $0.50 input and $1.50 output per million tokens, but GPT-4o mini is meaningfully more capable at comparable or lower cost.
How much does GPT-4o cost per 1,000 tokens?
GPT-4o costs roughly $0.0025 per 1,000 input tokens and $0.01 per 1,000 output tokens. A typical chat message with 1,000 tokens in and 500 tokens out costs about $0.0075 — three-quarters of a cent per message.
Is the Claude API cheaper than OpenAI?
It depends on the tier. Claude Haiku 4.5 ($1 / $5 per million tokens) sits between GPT-4o mini and GPT-4o on price. Claude Sonnet 4.6 ($3 / $15) is slightly more expensive than GPT-4o ($2.50 / $10) but comparable in capability for many tasks. Claude Opus 4.7 ($15 / $75) is priced like OpenAI's o1 for frontier tasks.
Why is my OpenAI bill higher than I expected?
Three common reasons: (1) output tokens cost 3–5x more than input tokens and most usage calculators underweight this, (2) reasoning models like o1 generate hidden 'thinking' tokens you pay for, (3) long context tokens, vision inputs, and structured-output formatting each add overhead. Tracking spend in real time — not at month-end — is the only reliable defense.
How do I set a hard spending limit on OpenAI?
OpenAI's dashboard allows a soft 'usage limit' in your billing settings, but it is a notification, not an enforced cap — the provider continues charging after you cross it. For a hard cap, you need either a third-party tool that reads the Usage API and throttles or alerts in real time, or you need to cap at the provider level by using project-scoped API keys with their own limits.
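What a client-side hard cap looks like in practice: refuse to send a request once tracked spend would exceed the budget. This is a hypothetical sketch with made-up class and method names; a real deployment would reconcile the local tally against the provider's Usage API rather than trust estimates alone:

```python
# Minimal client-side hard cap: block calls that would exceed the budget.

class BudgetExceeded(Exception):
    pass

class CappedClient:
    def __init__(self, monthly_cap_usd):
        self.cap = monthly_cap_usd
        self.spent = 0.0

    def request(self, estimated_cost_usd, call):
        """Run `call()` only if it fits within the remaining budget."""
        if self.spent + estimated_cost_usd > self.cap:
            raise BudgetExceeded(
                f"${self.spent:.2f} spent of ${self.cap:.2f} cap")
        self.spent += estimated_cost_usd
        return call()

client = CappedClient(monthly_cap_usd=1.00)
client.request(0.75, lambda: "ok")        # allowed: $0.75 of $1.00
try:
    client.request(0.50, lambda: "ok")    # would exceed the cap
except BudgetExceeded as e:
    print("blocked:", e)
```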
How accurate are OpenAI API cost calculators?
A calculator is an estimate. Real cost varies because token counts for the same English input differ across models (different tokenizers), and because output tokens are unpredictable. Use the calculator to set a realistic cap, then use a tracker to see actual spend against that cap.
Does ChatGPT Plus count against my API spend?
No. ChatGPT Plus ($20/month subscription) is a separate product from the API. API spend is billed per-token based on the rates in this article. Using ChatGPT in a browser does not consume API credits.
Pricing reflects publicly listed per-million-token rates as of April 2026. Providers adjust frequently. This article is updated quarterly — the last review date is visible in the page metadata. Verify against the provider's own pricing pages before building on these numbers.
Worked examples assume 30-day months and flat daily request volume. Real usage is bursty; real bills have spikes. The point of the examples is to illustrate order of magnitude, not to promise a number.