Skip to content
TopInsight .co
Two abstract pricing graph visualisations side by side — warm amber (Anthropic) and cool teal (OpenAI) — in dark void editorial.

Anthropic vs OpenAI API pricing: the actual math at typical coding workloads

Both API providers iterated pricing through 2025. The honest "which is cheaper" answer depends entirely on workload shape. Here is the working math.

C Charles Lin ·

API pricing comparisons usually publish the per-million-token rates and call it analysis. Real pricing depends on what you’re doing — and on which models you’re routing to. After modelling actual spend across multiple workloads on both providers, here is the working pricing math.

The headline rates in mid-2025

Anthropic Claude:

ModelInput $/MOutput $/MContext
Claude 3.7 Sonnet$3.00$15.00200K
Claude 3.5 Haiku$0.80$4.00200K
Claude 3 Opus (legacy)$15.00$75.00200K

OpenAI:

ModelInput $/MOutput $/MContext
GPT-4o$2.50$10.00128K
GPT-4o-mini$0.15$0.60128K
GPT-4.1 (newer)variesvaries1M
o1 (reasoning)$15.00$60.00200K
o1-mini$3.00$12.00128K

The raw rates favour OpenAI on the cheap tier (GPT-4o-mini at $0.15/$0.60 is cheaper than Claude Haiku) and Anthropic on the premium tier (Sonnet at $3/$15 is cheaper than GPT-4o for the typical input-light, output-heavy coding workload).

The dirty secret: token ratios matter more than rates

For typical coding workloads, the input:output ratio is roughly 3:1 — you send a lot of context (existing code, prompts, examples) and the model writes less code back. The headline output rate dominates total cost.

Worked example: 1M tokens in, 300K out:

  • Claude 3.7 Sonnet: 1M × $3 + 0.3M × $15 = $7.50
  • GPT-4o: 1M × $2.50 + 0.3M × $10 = $5.50
  • Claude Haiku: 1M × $0.80 + 0.3M × $4 = $2.00
  • GPT-4o-mini: 1M × $0.15 + 0.3M × $0.60 = $0.33

GPT-4o is cheaper than Sonnet here. GPT-4o-mini is dramatically cheaper than Haiku.

But — and this matters — the cheaper-tier models produce more tokens and lower-quality output. The cost-per-completed-task is not the same as cost-per-token.

Where Anthropic actually wins on cost

Task completion rate on hard tasks. Claude 3.7 Sonnet finishes complex multi-file refactors more often than GPT-4o in my testing. A “$5 task that needs 2 retries on GPT-4o” can be a “$7 task that completes first try on Claude.” The retry math flips the headline rate.

Prompt caching. Anthropic ships prompt caching that reduces input cost by ~90% for repeated context. For workflows that re-send the same system prompt + tool definitions repeatedly (i.e., every coding session), this matters a lot. OpenAI has equivalent now but the discount math is similar.

Output predictability. Claude tends to be more concise than GPT-4o for coding output. Lower output token count = lower total cost.

Where OpenAI actually wins on cost

Bulk / volume workloads. GPT-4o-mini at $0.15/$0.60 is cheaper than anything Anthropic offers. For automation, content processing, batch work — OpenAI wins.

Reasoning-heavy tasks at scale. The o-series pricing is similar to Anthropic’s Sonnet but reasoning models often complete tasks Sonnet would need multiple iterations on.

Longer context for cheap. GPT-4.1 has 1M context at competitive pricing. Anthropic’s 200K is workable but smaller.

The community signal

The 358-upvote r/ChatGPTCoding “Value of $200/month AI users” thread reflects the heavy-user reality: serious coding users are spending $100-300/month total across both providers + subscription tiers. The cost question isn’t “which is cheaper per token” — it’s “which combination delivers the value at the right total cost.”

The 118-upvote “Anthropic is lagging on cheap fast models” thread captures the real Anthropic weakness — Haiku is competitive but not cheap-tier-dominant. OpenAI’s mini tier is the best in class for bulk work.

The 148-upvote “PSA for anyone using Cursor — wasting most of your AI requests” thread is the canonical “you’re paying for tokens you don’t need” piece — pointing out that wrapping AI in IDE tools often burns tokens on things that don’t need premium models.

The working multi-provider pattern

What heavy users actually do in 2025:

  1. Claude 3.7 Sonnet via Anthropic API or Claude Pro for default coding work
  2. GPT-4o-mini via OpenAI API for bulk / cheap routing (the cheapest credible “good enough” model)
  3. DeepSeek V3 via DeepSeek API for cost-conscious bulk where China-hosting is acceptable
  4. OpenAI o-series for hard reasoning when speed isn’t the goal
  5. Gemini 2.5 Pro for long-context analysis

Total monthly spend for a heavy user: $50-200/month across providers. The pattern wins because each model handles what it’s best at — no single-provider strategy lands as well.

The pricing tier comparison

Picking by your specific workload

Pros

  • Default coding driver → Claude 3.7 Sonnet (best quality-per-dollar at the premium tier)
  • Bulk / automation → GPT-4o-mini (cheapest credible quality)
  • Hard reasoning → OpenAI o3-mini (cheaper o-series option)
  • Long context → Gemini 2.5 Pro (2M context cheap)
  • Cost-optimised bulk → DeepSeek V3 (cheapest, China-hosted)
  • Prompt-caching workflows → Anthropic (mature caching at the premium tier)

Cons

  • Don’t single-provider lock-in if you can avoid it — multi-model routing is the optimal pattern
  • Don’t over-route at low volume — switching providers has overhead
  • Don’t use o-series for routine work — slow and expensive for simple tasks
  • Don’t use Haiku for cheap-bulk — GPT-4o-mini is cheaper at similar quality for that tier
  • Don’t use Sonnet for bulk — overpriced for routine work
  • Don’t ignore prompt caching — for tooling that resends context, it’s a 5-10× cost reduction

How to actually optimise

For an engineer running $50-100/month on AI APIs:

  1. Audit your token usage — most providers offer breakdowns. Find the workload chunks that dominate cost.
  2. Route those to cheaper models — bulk to GPT-4o-mini, reasoning to o3-mini, default to Sonnet
  3. Enable prompt caching where supported
  4. Use the subscription paths (Claude Pro, ChatGPT Pro) for personal use — often cheaper than equivalent API spend for individual users

The 30% savings is achievable with two hours of routing setup. Worth it.

For the broader model landscape, see our Claude vs GPT vs Gemini comparison. For the cheap-tier deep dive, DeepSeek V3 review.

Sources

Every reference behind this piece. If we make a claim, it's because at least one of these said so — or we lived it ourselves.

  1. Firsthand Modelled API spend across multiple workloads on both providers
  2. Docs Anthropic API pricing — Anthropic
  3. Docs OpenAI API pricing — OpenAI
  4. Blog r/ChatGPTCoding — value of $200/month AI users (358 ups) — r/ChatGPTCoding
  5. Blog r/ChatGPTCoding — Anthropic lagging on cheap fast models (118 ups) — r/ChatGPTCoding
  6. Blog r/ChatGPTCoding — Cursor users wasting AI requests (148 ups) — r/ChatGPTCoding
  7. YouTube Independent API pricing breakdowns — Various