Why Your Claude Code Context Window Fills Up
Claude Code's 200K-token context fills faster than most developers expect. Here's what actually consumes it — and how to keep long sessions cheap and productive.
Quick answer
Run /context to see how full your window is, then /compact or /clear to free space. Run npx whoburnedmore to see how much total token spend your context consumed across all your sessions.
The context window is not a storage cap in the traditional sense; it is a cost multiplier. Every token currently in context is re-sent with every new message, so a 150K-token context makes each turn cost 150K input tokens before you even type your question. Long sessions do not just run out of space; they get progressively more expensive per message until they hit the 200K ceiling and stop altogether.
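The cost-multiplier effect can be sketched in a few lines of Python. This is an illustrative model only: it assumes the context grows by a fixed amount each turn and that the whole context is billed as input on every message. The function name and the growth figures are hypothetical; real sessions grow unevenly.

```python
def cumulative_input_tokens(start_context: int, growth_per_turn: int,
                            turns: int, ceiling: int = 200_000) -> int:
    """Sum the input tokens billed across a session.

    Assumption: the full context is re-sent on every turn and grows by a
    constant growth_per_turn. The session stops once context exceeds ceiling.
    """
    total = 0
    context = start_context
    for _ in range(turns):
        if context > ceiling:
            break  # the session cannot continue past the ceiling
        total += context            # this turn re-sends everything in context
        context += growth_per_turn  # new file reads and outputs accumulate
    return total


# Hypothetical 30-turn session: context growing 5K per turn vs. staying flat.
growing = cumulative_input_tokens(10_000, 5_000, 30)
flat = cumulative_input_tokens(10_000, 0, 30)
print(f"growing context: {growing:,} input tokens")  # 2,475,000
print(f"flat context:    {flat:,} input tokens")     # 300,000
```

Under these assumptions the same 30 turns cost roughly eight times as much when the context is allowed to grow.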
What fills your context window?
Most developers assume their context is dominated by their own typed messages. In practice, typed messages are usually the smallest contributor. The big consumers are things Claude does on your behalf: reading files, running tools, and accumulating prior turn outputs. Here is how a typical long session fills up:
File reads accumulate silently
When Claude reads a file, whether because you asked it to or because it decided the file was relevant, the full file content goes into context. A single 500-line TypeScript file is roughly 6–8K tokens. If Claude reads 10 files of that size over the course of a session, that is 60–80K tokens of static content re-sent with every subsequent message. This is typically the single largest context consumer in a coding session.
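As a rough worked example of that arithmetic, the sketch below multiplies the file-read footprint by the number of later messages that re-send it. The 6–8K per-file figure comes from the estimate above; the 20 follow-up messages and the function name are assumptions chosen for illustration.

```python
def resent_file_tokens(files_read: int, tokens_per_file: int,
                       later_messages: int) -> int:
    """Input tokens spent re-sending already-read files on later messages."""
    return files_read * tokens_per_file * later_messages


# 10 files at 6-8K tokens each, re-sent over a hypothetical 20 later messages.
low = resent_file_tokens(10, 6_000, 20)
high = resent_file_tokens(10, 8_000, 20)
print(f"{low:,} to {high:,} tokens")  # 1,200,000 to 1,600,000 tokens
```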
Tool call results pile up
Shell command outputs, test results, and search results all appear in context as tool call responses. A find . -type f on a large monorepo can return thousands of lines. A failing test suite output can be 10K tokens on its own. These are appended verbatim, and unlike user messages, they tend to be long and repetitive.
A large CLAUDE.md is always present
Your project's CLAUDE.md file is loaded into context at the start of every session and stays there. A detailed CLAUDE.md with full coding conventions, file layout, environment variables, and decision history can easily run 2–5K tokens. That amount is small in isolation, but it is always there, and it compounds with everything else.
Each message pays for the entire context, not just the new content
If your context holds 120K tokens and you type a 50-token question, you are billed for 120,050 tokens of input. Context bloat therefore directly drives your token consumption and your rate-limit utilization: a 150K context costs 3× as much per message as a 50K context.
How do I see how full my context window is?
Claude Code exposes context state through the /context slash command. It shows the current context size as a token count and a percentage of the 200K ceiling, broken down by type where possible.
$ claude /context
Context window status
─────────────────────────────────────────
Total in context      143,280 tokens
Window capacity       200,000 tokens
Utilization           ████████████░░░ 72%
Breakdown
  file reads          67,400 tokens
  tool outputs        38,100 tokens
  conversation turns  28,600 tokens
  system / CLAUDE.md  9,180 tokens
At 72% context fill, every message you send costs at least 143K input tokens before Claude types a word. Two or three more large file reads would push this session past the ceiling. This is the moment to decide: compact now, or finish the current task and clear before starting the next one.
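The headroom calculation behind that judgment is simple enough to sketch. The 200K capacity and the 143,280-token total come from the example output; the 20K-token size for a "large file read" and the helper names are assumptions for illustration.

```python
WINDOW_CAPACITY = 200_000  # ceiling used throughout this guide


def headroom(used_tokens: int, capacity: int = WINDOW_CAPACITY) -> int:
    """Tokens still available before the window is full."""
    return max(capacity - used_tokens, 0)


def reads_that_fit(used_tokens: int, tokens_per_read: int,
                   capacity: int = WINDOW_CAPACITY) -> int:
    """How many more reads of a given size fit in the remaining window."""
    return headroom(used_tokens, capacity) // tokens_per_read


used = 143_280
print(f"utilization: {used / WINDOW_CAPACITY:.0%}")            # 72%
print(f"headroom: {headroom(used):,} tokens")                  # 56,720 tokens
print(f"20K-token reads left: {reads_that_fit(used, 20_000)}")  # 2
```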
How do I free up context space?
Claude Code provides two built-in mechanisms for resetting or shrinking context. They serve different situations:
/compact — summarize and continue
Claude summarizes the conversation history into a condensed form, discarding the raw turn-by-turn content but preserving the key decisions and code state. You stay in the same session, and Claude retains context about what you were doing. Use this when you want to keep working on the same task.
/clear — full reset
Wipes the context entirely and starts a new session. All conversation history is gone; only your CLAUDE.md and the files you explicitly re-read will be in context for the next message. Use this between distinct tasks when you do not need continuity from the previous session.
When to compact vs when to clear
Use /compact when you are mid-task and Claude still needs to remember what you were doing — for example, halfway through a multi-file refactor. Use /clear when you are switching to a completely different task or codebase, or when you want to be certain no stale file content is polluting the new session.
$ claude /compact
↳ summarizing context (143K → ~18K tokens)…
Context compacted.
Summary: refactoring auth module — completed login.ts, session.ts in progress, token.ts pending.
Ready for next message.
New context size: 18,200 tokens (9% of window)
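To see when compacting pays off, here is a minimal sketch of the trade-off. It assumes, as a simplification, that compaction itself costs one pass over the full pre-compaction context and that the context does not grow afterward; neither is a documented billing rule, and the function name is hypothetical.

```python
def compaction_net_savings(before: int, after: int,
                           remaining_messages: int) -> int:
    """Net input tokens saved by compacting now, under a simple model.

    Assumptions: compaction costs one pass over the `before` context, and
    the context stays at a constant size for the remaining messages.
    """
    cost_without = before * remaining_messages
    cost_with = before + after * remaining_messages
    return cost_without - cost_with


# Figures from the example above: 143K before, 18,200 after.
for n in (1, 2, 10):
    print(n, f"{compaction_net_savings(143_000, 18_200, n):,}")
# 1 -18,200
# 2 106,600
# 10 1,105,000
```

Under this model, compacting breaks even by the second message and saves over a million input tokens across ten.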
How does tracking total token spend reveal context bloat?
Context bloat is invisible until you measure it. A session that should have cost 500K input tokens — based on the meaningful work done — might actually cost 3M tokens because of context re-attachment. The difference is almost entirely context overhead.
Run npx whoburnedmore after a session and compare the per-day input token total against the amount of output code and answers you actually received. If the ratio looks wrong — say, 8M input tokens for a few hundred lines of generated code — context overhead is the culprit.
$ npx whoburnedmore
↳ scanning claude code logs…
TODAY
  input tokens     9.8M
  output tokens    0.4M
  input/output     24.5× ← context bloat likely
  estimated cost   $38.20
typical ratio: 5–10× input to output
24× suggests large repeated file reads in context
A healthy Claude Code session typically runs a 5–10× ratio of input to output tokens. A ratio above 15× is a reliable indicator of context overhead. When you see that number, the fix is almost always earlier and more aggressive use of /compact.
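The ratio check can be written as a small helper. The 5–10× and 15× thresholds come from the guidance above; the "borderline" label for the 10–15× gap, the treatment of ratios below 5× as healthy, and the function names are assumptions added for illustration.

```python
def input_output_ratio(input_tokens: float, output_tokens: float) -> float:
    """Ratio of input tokens to output tokens for a session or a day."""
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    return input_tokens / output_tokens


def classify_ratio(ratio: float) -> str:
    """Label a ratio using the thresholds quoted in this guide."""
    if ratio > 15:
        return "context bloat likely"
    if ratio > 10:
        return "borderline"  # assumption: the guide does not name this band
    return "healthy"


ratio = input_output_ratio(9_800_000, 400_000)
print(f"{ratio:.1f}x: {classify_ratio(ratio)}")  # 24.5x: context bloat likely
```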
1. Start each new task with /clear or a fresh session
Never carry context from one unrelated task to the next. Old file reads and tool outputs from a previous task will silently inflate the cost of every new message.
2. Run /context at the start of expensive tasks
If you see context above 50%, compact before starting a task that involves reading many more files. Compacting now costs one round-trip; not compacting means paying for 100K+ extra tokens on every subsequent turn.
3. Keep your CLAUDE.md lean
Every token in CLAUDE.md is paid on every message of every session. A 5K-token CLAUDE.md that you never read manually costs 5K tokens per message, or 500K tokens over a 100-message session. Review it periodically and remove stale sections.
4. Monitor weekly spend for context-bloat patterns
Run npx whoburnedmore weekly. If your input-to-output ratio trends up over several days, you are letting contexts grow too large before compacting. Catching the pattern early lets you adjust the habit before it hits your rate-limit budget.
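The habits above reduce to a couple of simple rules, sketched here in Python. The 50% compaction threshold and the 5K × 100 CLAUDE.md arithmetic come from the list; the function names and the idea of encoding the rules as a helper are hypothetical, not part of Claude Code.

```python
def suggest_action(used_tokens: int, starting_new_task: bool,
                   capacity: int = 200_000,
                   compact_threshold: float = 0.5) -> str:
    """Suggest a slash command based on the habits described in this guide."""
    if starting_new_task:
        return "/clear"      # never carry context into an unrelated task
    if used_tokens / capacity > compact_threshold:
        return "/compact"    # above 50%: shrink before reading more files
    return "continue"


def claude_md_overhead(claude_md_tokens: int, messages: int) -> int:
    """Input tokens spent re-sending CLAUDE.md over a session."""
    return claude_md_tokens * messages


print(suggest_action(143_280, starting_new_task=False))  # /compact
print(f"{claude_md_overhead(5_000, 100):,}")             # 500,000
```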
Context window size: 200K tokens
Healthy input/output ratio: 5–10×
Ratio that signals context bloat: above 15×
Context cost and rate-limit cost are the same budget
Every input token in your context counts against your 5-hour rate-limit window. Cutting your context in half roughly doubles how long you can work before hitting the limit. This is the highest-leverage optimization available: not writing shorter prompts, but managing what stays in context. For a full breakdown of how the rate-limit budget works, see how to reduce Claude Code token usage. For what to do when context overhead has already pushed you into a usage limit, see what to do when Claude Code usage limit is reached.
Related guides
How to Reduce Claude Code Token Usage
Ten ways to burn fewer tokens — starting with measuring where they actually go.
Claude Code Usage Limit Reached? Here's What to Do
Locked out mid-session? See what you used and when the limit resets.
How to Check Claude Code Token Usage
See your Claude Code tokens by day, model, and project — and how the built-in /usage compares.