"How much does Claude Opus 4.8 cost?" has two answers. The list price is fixed and public. Your bill is not β it can swing 5β10Γ on the same workload depending on how you structure the prompt. This guide covers both, with numbers you can reproduce.
Opus 4.8 is Anthropic's current flagship as of June 2026. It's the model the whole ClaudeFarm farm is built on, and it's the one our cost tooling is tuned for. Everything below is the API pricing we pay and verify, not a marketing table.
The list price (verified Jun 2026)
Three numbers, all per million tokens:
| Token type | Price / 1M tokens | What it is |
|---|---|---|
| Input (uncached) | $5.00 | Every token you send that isn't a cache hit |
| Cached input (cache-read) | $0.50 | Tokens served from a warm prompt cache β ~10% of input |
| Output | $25.00 | Every token the model generates |
Two facts do most of the work here. First, output is 5Γ the price of input β verbose responses are where naΓ―ve usage bleeds money. Second, cached input is one-tenth the price of fresh input. That 10Γ gap on the input side is the single biggest lever you have, and most teams leave it on the table.
Why caching is the whole game
Anthropic's prompt cache lets you mark a stable prefix β system prompt, codebase, reference docs, anything that doesn't change turn to turn β and reuse it. The mechanics that matter for cost:
- Cached reads cost ~10% of input ($0.50/M vs. $5.00/M).
- The warm window is 5 minutes. Each cache hit refreshes the timer. Go quiet for 5 minutes and the cache goes cold; the next call pays full input price to re-warm it.
- The variable tail is always full price. Your actual question, the current diff, fresh input β those are never cached, and that's fine. They're small.
The mental model that unlocks the savings: treat the context window as a persistent cache, not a one-shot prompt. Send your big stable prefix once, then fire many small queries against the warm cache inside the 5-minute window. The whole session costs about what a couple of cold queries would. This is the core idea behind the free context-budgeter skill and the paid 1M Context Cookbook β and it's exactly the pattern we cover in How to Actually Use Claude's 1M Context Window.
A worked 200k-token codebase workflow
Here's the canonical case: you load a 200,000-token codebase as a stable prefix and ask Claude Opus 4.8 questions about it. Each question is a ~1,000-token tail (your prompt), and each answer is a ~1,500-token output.
The first call is cold. Nothing is cached yet, so the whole 200k prefix is billed at the input rate:
| Component | Tokens | Rate | Cost |
|---|---|---|---|
| Prefix (codebase) β uncached | 200,000 | $5.00/M | $1.0000 |
| Tail (your question) | 1,000 | $5.00/M | $0.0050 |
| Output (the answer) | 1,500 | $25.00/M | $0.0375 |
| First call total | $1.0425 |
Every follow-up within 5 minutes is warm. The 200k prefix is now a cache hit at $0.50/M β one-tenth the price:
| Component | Tokens | Rate | Cost |
|---|---|---|---|
| Prefix (codebase) β cached | 200,000 | $0.50/M | $0.1000 |
| Tail (your question) | 1,000 | $5.00/M | $0.0050 |
| Output (the answer) | 1,500 | $25.00/M | $0.0375 |
| Warm call total | $0.1425 |
So a 10-question session against that codebase, all inside warm windows, costs $1.0425 + (9 Γ $0.1425) = $2.32. Run the same 10 questions cold β re-sending the 200k prefix at full price every time β and you'd pay 10 Γ $1.0425 = $10.43. Same work, same answers, 4.5Γ the bill for ignoring the cache.
Cold vs. cache-smart daily cost
Scale that to a working day. Say a developer or an agent fires 200 queries against the same 200k-token codebase, and the work is bursty enough that cache stays warm for most of them (a realistic 1 cold re-warm per 20 queries):
| Strategy | Cold calls | Warm calls | Daily cost |
|---|---|---|---|
| NaΓ―ve (no caching) | 200 | 0 | $208.50 |
| Cache-smart (~10 re-warms) | 10 | 190 | $37.50 |
Same 200 queries. The cache-smart setup costs ~$37.50/day vs. ~$208.50/day β an 82% cut, purely from structuring the prompt so the codebase is a cached prefix instead of fresh input on every call. Over a 20-workday month that's the difference between roughly $750 and $4,170.
Where the money actually leaks
Four mistakes account for most surprise bills on Opus 4.8:
- Letting the cache go cold. The 5-minute window resets on every hit. A workflow that fires once every 6 minutes pays full input price every single time β you get zero cache benefit. Either batch your queries into bursts or accept that you're running cold.
- Reordering the prefix. Caching is prefix-based. If you shuffle the order of your system prompt or codebase between calls, you bust the cache and re-pay full input. Keep the stable part byte-stable.
- Over-generating output. Output is $25/M β 5Γ input. A skill that returns three paragraphs where one sentence would do is quietly the most expensive thing in your stack. Constrain response length.
- Stuffing context you'll never query. A 1M-token prefix you only ask one question against is mostly wasted cold spend. If you're doing exact lookups, a SQL or vector search is cheaper than long context β that's a retrieval job, not a context job.
If you're architecting an agent or a multi-turn workflow and want to know the bill before you ship it, the free context-budgeter skill calculates cache hit rates and per-query cost from a plain-English description of your workflow. It's a complete, working ClaudeFarm crop given away so you can judge the quality of the paid ones.
The takeaway
Claude Opus 4.8 lists at $5/M input, $0.50/M cached input, $25/M output. But the list price isn't your bill. Structure your prompt so the big stable part is a cached prefix, keep your bursts inside the 5-minute warm window, and constrain output β and the same workload costs a fraction of the naΓ―ve number. We just watched a 200-query day drop from $208 to $37 on exactly that move.
Want the real math for your workflow, not our example? Compute your own cost free in the Hive Terminal β it runs the live budgeter against current Opus 4.8 rates. And if you want the patterns baked in, the 1M Context Cookbook ships 25 cache-smart recipes for $9, one payment.
New to skills? Start with our companion guide: How to Install Claude Code Skills (2026 Step-by-Step).