Claude Code /fast economics: enable it on turn one or pay a full-context toll

Chris Harper


Jun 11, 2026 · 11:30 UTC

Developer Tools

Best Practices

A same-day-verified breakdown of Claude Code's fast mode is worth internalizing before you build a /fast habit. The headline trade is simple: the same Opus model, up to 2.5x faster, at 2x the token price ($10/$50 per MTok input/output on Opus 4.8 vs. $5/$25 standard). Three billing wrinkles change how you should use it.

The first-enable toll. Fast mode adds a request header that's part of the prompt cache key. The first /fast in a conversation therefore invalidates your cache, and the entire history is re-read as uncached input at fast rates. At 400K tokens of context, that toggle costs about $4 (400K tokens at $10/MTok) before it does anything; on turn one, it's about a cent. Enable it at session start or wait for the next session. (On v2.1.86+ the charge applies once per conversation; later toggles keep the cache.)
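The toll arithmetic is easy to sketch. The snippet below is a rough estimate only: it assumes the whole history is billed once at the fast input rate quoted above, and it ignores any cache-read or cache-write pricing, which this piece does not quote. The function name and the sample context depths are illustrative, not part of any Claude Code API.

```python
# Fast-mode input rate quoted in the article for Opus 4.8 (USD per million tokens).
FAST_INPUT_PER_MTOK = 10.00


def first_enable_toll(context_tokens: int,
                      input_rate_per_mtok: float = FAST_INPUT_PER_MTOK) -> float:
    """Estimate the one-time cost of re-reading the history as uncached input.

    Assumption: every token already in the conversation is billed once at the
    fast input rate when /fast is first enabled.
    """
    return context_tokens / 1_000_000 * input_rate_per_mtok


# Hypothetical context depths: turn one, mid-session, and deep in a long session.
for depth in (1_000, 50_000, 400_000):
    print(f"{depth:>7,} tokens -> ${first_enable_toll(depth):.2f}")
```

At 400,000 tokens this gives $4.00, matching the figure above; at 1,000 tokens it is $0.01.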

Separate money. On Pro/Max/Team plans, fast mode bills usage credits from the very first token and never touches included plan usage. Unexpected line items on your credits are likely fast mode. It also has its own rate-limit pool; hitting that limit degrades gracefully to standard speed.

Where it's wrong. Skip it for agents, overnight runs, and CI: nobody's watching, output tokens dominate, and fast mode isn't even available in the Batch API (the actual 50%-off lever). And if you're still pointed at Opus 4.7/4.6, fast mode there is a 6x premium over standard ($30/$150 per MTok), so switch models before switching speeds. Fast mode is not available on Bedrock, Vertex, or Azure Foundry, and it is CLI only, not in the VS Code extension.
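To see why unattended work belongs on standard or batch pricing, here is a minimal comparison using the rates quoted in this piece. The batch row assumes the 50% discount applies equally to the standard input and output rates, and the job size (2M input, 1M output tokens) is a made-up example; caching is ignored throughout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Per-million-token prices in USD."""
    input_per_mtok: float
    output_per_mtok: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total cost for the given token counts, ignoring caching."""
        return (input_tokens * self.input_per_mtok
                + output_tokens * self.output_per_mtok) / 1_000_000


RATES = {
    "standard (Opus 4.8)": Rate(5.0, 25.0),
    "fast (Opus 4.8)": Rate(10.0, 50.0),
    "fast (Opus 4.7/4.6)": Rate(30.0, 150.0),
    # Assumption: the Batch API's 50% discount applies to both standard rates.
    "batch (Opus 4.8)": Rate(2.5, 12.5),
}

# Hypothetical overnight job: 2M input tokens, 1M output tokens.
JOB_INPUT, JOB_OUTPUT = 2_000_000, 1_000_000

for name, rate in RATES.items():
    print(f"{name:<22} ${rate.cost(JOB_INPUT, JOB_OUTPUT):>7.2f}")
```

Under these assumptions the same job costs $35.00 at standard rates, $70.00 in fast mode, $210.00 in fast mode on the older models, and $17.50 through batch.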

Rule of thumb: /fast for interactive debugging from turn one; standard mode for everything autonomous.

Sources: Developers Digest, Claude Code fast mode docs, Claude API pricing

CloudCodeTree
