Why Your Claude Code Context Window Fills Up

Claude Code's 200K-token context fills faster than most developers expect. Here's what actually consumes it — and how to keep long sessions cheap and productive.

By Arham Wani | Last updated June 2026 | whoburnedmore guide

Quick answer

Claude Code context window full errors occur when your accumulated conversation — including all messages, tool call results, and file reads — exceeds approximately 200,000 tokens. Each message in the session pays the cost of the entire context accumulated so far. Run /context to see how full your window is, then /compact or /clear to free space. Run npx whoburnedmore to see how much total token spend the context ate across all your sessions. 🔥

The context window is not a storage cap in the traditional sense — it is a cost multiplier. Every token currently in context is re-sent with every new message, so a 150K-token context makes each turn cost 150K tokens in input before you even type your question. Long sessions do not just run out of space; they get progressively more expensive per message until they hit the 200K ceiling and stop altogether.
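To make the multiplier effect concrete, here is a minimal sketch that sums the input tokens billed over a session. The function name, the starting context, and the per-turn growth figure are illustrative assumptions, not measurements from Claude Code; only the 200K ceiling comes from the text above.

```python
def cumulative_input_tokens(turns: int, start_context: int,
                            growth_per_turn: int, ceiling: int = 200_000) -> int:
    """Sum the input tokens billed across a session whose context grows each turn."""
    total = 0
    context = start_context
    for _ in range(turns):
        if context > ceiling:
            break  # the session cannot continue past the window ceiling
        total += context  # the whole context is re-sent as input on this turn
        context += growth_per_turn
    return total


# Hypothetical session: 30 turns, starting at 10K tokens of context.
flat = cumulative_input_tokens(turns=30, start_context=10_000, growth_per_turn=0)
growing = cumulative_input_tokens(turns=30, start_context=10_000, growth_per_turn=5_000)
print(flat)     # 300000
print(growing)  # 2475000
```

Under these assumed numbers, the session that adds 5K tokens per turn is billed roughly eight times the input of one whose context stays flat, even though both send the same number of messages.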

What fills your context window?

Most developers assume their context is dominated by their own typed messages. In practice, typed messages are usually the smallest contributor. The big consumers are things Claude does on your behalf: reading files, running tools, and accumulating prior turn outputs. Here is how a typical long session fills up:

[Chart: share of context window consumed by source type in a typical 2-hour session (K tokens)]

File reads accumulate silently

When Claude reads a file, whether because you asked it to or because it decided the file was relevant, the full file content goes into context. A single 500-line TypeScript file is roughly 6–8K tokens. If Claude reads 10 files of that size over the course of a session, that is 60–80K tokens of static content re-sent with every subsequent message. This is typically the single largest context consumer in a coding session.
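The arithmetic above can be sketched as a small helper. The function name and the figure of 20 follow-up messages are assumptions for illustration; the 10 files and the 6–8K tokens per file come from the paragraph above.

```python
def file_read_overhead(files_read: int, tokens_per_file: int, later_messages: int) -> int:
    """Tokens re-sent because file contents stay in context for later messages."""
    static_tokens = files_read * tokens_per_file  # content sitting in context
    return static_tokens * later_messages         # re-sent once per later message


# 10 files at 6-8K tokens each, followed by a hypothetical 20 more messages.
low = file_read_overhead(files_read=10, tokens_per_file=6_000, later_messages=20)
high = file_read_overhead(files_read=10, tokens_per_file=8_000, later_messages=20)
print(low, high)  # 1200000 1600000
```

In this scenario the file contents alone account for 1.2M to 1.6M input tokens, none of which is new information after the first read.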

Tool call results pile up

Shell command outputs, test results, and search results all appear in context as tool call responses. A find . -type f on a large monorepo can return thousands of lines. A failing test suite output can be 10K tokens on its own. These are appended verbatim, and unlike user messages, they tend to be long and repetitive.

A large CLAUDE.md is always present

Your project's CLAUDE.md file is loaded into context at the start of every session and stays there. A detailed CLAUDE.md with full coding conventions, file layout, environment variables, and decision history can easily run 2–5K tokens. That amount is small in isolation, but it is always there, and it compounds with everything else.

Each message pays for the entire context, not just the new content

If your context holds 120K tokens and you type a 50-token question, you are billed for 120,050 tokens of input. Context bloat therefore drives both your token consumption and your rate-limit utilization. A 150K context costs 3× as much per message as a 50K context.

How do I see how full my context window is?

Claude Code exposes context state through the /context slash command. It shows the current context size as a token count and a percentage of the 200K ceiling, broken down by type where possible.

zsh — ~/webapp

$ claude /context

  Context window status
  ─────────────────────────────────────────
  Total in context      143,280 tokens
  Window capacity       200,000 tokens
  Utilization           ████████████░░░  72%

  Breakdown
  file reads            67,400 tokens
  tool outputs          38,100 tokens
  conversation turns    28,600 tokens
  system / CLAUDE.md     9,180 tokens

At 72% context fill, every message you send costs at least 143K input tokens before Claude types a word. Two or three more large file reads would push this session past the ceiling. This is the moment to decide: compact now, or finish the current task and clear before starting the next one.
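A rough headroom check for the session above might look like the following. The helper name and the 20K-token size for a "large" file read are assumptions; the window size and current context total are taken from the example output.

```python
WINDOW = 200_000  # context ceiling cited in this guide


def reads_until_full(context_tokens: int, tokens_per_read: int, window: int = WINDOW) -> int:
    """How many more reads of a given size fit before the window is exceeded."""
    headroom = max(window - context_tokens, 0)
    return headroom // tokens_per_read


# Assuming a "large" file read is about 20K tokens (hypothetical figure).
print(reads_until_full(143_280, 20_000))  # 2
# Using the 8K upper estimate for a 500-line TypeScript file.
print(reads_until_full(143_280, 8_000))   # 7
```

With 56,720 tokens of headroom, two 20K-token reads still fit and a third would cross the ceiling.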

$$\text{cost}_{\text{msg}} = (\text{context\_tokens} + \text{prompt\_tokens}) \times \text{price}_{\text{input}} + \text{output\_tokens} \times \text{price}_{\text{output}}$$

Context cost per message scales linearly with accumulated context size.
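The formula translates directly into code. This is a sketch of the formula exactly as written above; the per-million-token prices and the 500-token reply are placeholder assumptions, not actual rates for any model.

```python
def message_cost(context_tokens: int, prompt_tokens: int, output_tokens: int,
                 price_input_per_mtok: float, price_output_per_mtok: float) -> float:
    """Cost of one message: all input tokens at the input price, plus output."""
    input_cost = (context_tokens + prompt_tokens) * price_input_per_mtok / 1_000_000
    output_cost = output_tokens * price_output_per_mtok / 1_000_000
    return input_cost + output_cost


# Placeholder prices (assumed): 3.00 per million input tokens, 15.00 per million output.
bloated = message_cost(143_280, 50, 500, 3.0, 15.0)
compact = message_cost(18_200, 50, 500, 3.0, 15.0)
print(f"{bloated:.4f} vs {compact:.4f}")
```

The same 50-token question with the same reply length costs several times more in the 143K-token session than in the 18K-token one, because the input term dominates.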

How do I free up context space?

Claude Code provides two built-in mechanisms for resetting or shrinking context. They serve different situations:

/compact — summarize and continue

Claude summarizes the conversation history into a condensed form, discarding the raw turn-by-turn content but preserving the key decisions and code state. You stay in the same session, and Claude retains context about what you were doing. Use this when you want to keep working on the same task.

/clear — full reset

Wipes the context entirely and starts a new session. All conversation history is gone; only your CLAUDE.md and the files you explicitly re-read will be in context for the next message. Use this between distinct tasks when you do not need continuity from the previous session.

When to compact vs when to clear

Use /compact when you are mid-task and Claude still needs to remember what you were doing — for example, halfway through a multi-file refactor. Use /clear when you are switching to a completely different task or codebase, or when you want to be certain no stale file content is polluting the new session.
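The decision rule can be written down as a tiny helper. The function is hypothetical and not part of Claude Code, and the 50% threshold is an assumption borrowed from the tip later in this guide about compacting when context is above 50%.

```python
def suggest_command(same_task: bool, context_fraction: float,
                    compact_threshold: float = 0.5) -> str:
    """Suggest /clear, /compact, or neither, based on the rule of thumb above."""
    if not same_task:
        return "/clear"      # switching tasks: no continuity needed
    if context_fraction >= compact_threshold:
        return "/compact"    # mid-task and the window is filling up
    return "continue"        # mid-task with plenty of headroom


print(suggest_command(same_task=True, context_fraction=0.72))   # /compact
print(suggest_command(same_task=False, context_fraction=0.72))  # /clear
print(suggest_command(same_task=True, context_fraction=0.20))   # continue
```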

zsh — mid-session refactor

$ claude /compact
  ↳ summarizing context (143K → ~18K tokens)…

  Context compacted.
  Summary: refactoring auth module - completed login.ts,
  session.ts in progress, token.ts pending.
  Ready for next message.

  New context size: 18,200 tokens (9% of window)

How does tracking total token spend reveal context bloat?

Context bloat is invisible until you measure it. A session that should have cost 500K input tokens — based on the meaningful work done — might actually cost 3M tokens because of context re-attachment. The difference is almost entirely context overhead.

Run npx whoburnedmore after a session and compare the per-day input token total against the amount of output code and answers you actually received. If the ratio looks wrong — say, 8M input tokens for a few hundred lines of generated code — context overhead is the culprit.

zsh — any directory

$ npx whoburnedmore
  ↳ scanning claude code logs…

  TODAY
  input tokens     9.8M
  output tokens    0.4M
  input/output     24.5×  ← context bloat likely
  estimated cost   $38.20

  typical ratio: 5–10× input to output
  24× suggests large repeated file reads in context

A healthy Claude Code session typically runs a 5–10× ratio of input to output tokens. A ratio above 15× is a reliable indicator of context overhead. When you see that number, the fix is almost always earlier and more aggressive use of /compact.
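The ratio check is simple enough to sketch by hand. The function and the "elevated" label for the 10–15× band are assumptions added for illustration; the 5–10× and 15× thresholds come from the paragraph above.

```python
def classify_ratio(input_tokens: int, output_tokens: int) -> tuple[float, str]:
    """Return the input/output ratio and a label based on this guide's thresholds."""
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    ratio = input_tokens / output_tokens
    if ratio > 15:
        label = "context bloat likely"
    elif ratio > 10:
        label = "elevated"  # assumed label for the band between 10x and 15x
    else:
        label = "typical"
    return ratio, label


# Figures from the example output above: 9.8M input, 0.4M output.
print(classify_ratio(9_800_000, 400_000))  # (24.5, 'context bloat likely')
```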

1. Start each new task with /clear or a fresh session
Never carry context from one unrelated task to the next. Old file reads and tool outputs from a previous task will silently inflate the cost of every new message.

2. Run /context at the start of expensive tasks
If you see context above 50%, compact before starting a task that involves reading many more files. Compacting now costs one round-trip; not compacting means paying for 100K+ extra tokens on every subsequent turn.

3. Keep your CLAUDE.md lean
Every token in CLAUDE.md is paid on every message of every session. A 5K-token CLAUDE.md costs 5K tokens per message whether or not you ever reread it yourself, which adds up to 500K tokens over a 100-message session. Review it periodically and remove stale sections.

4. Monitor weekly spend for context-bloat patterns
Run npx whoburnedmore weekly. If your input-to-output ratio trends up over several days, you are letting contexts grow too large before compacting. Catching the pattern early lets you adjust the habit before it hits your rate-limit budget.
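The savings behind tips 2 and 3 can be estimated with two one-line helpers. Both function names are hypothetical, the 20 remaining turns is an assumed figure, and the sketch ignores the one-time cost of the compaction round-trip itself.

```python
def claude_md_overhead(claude_md_tokens: int, messages: int) -> int:
    """Input tokens spent re-sending CLAUDE.md over a session."""
    return claude_md_tokens * messages


def compaction_savings(before: int, after: int, remaining_turns: int) -> int:
    """Input tokens saved on the remaining turns by compacting now."""
    return (before - after) * remaining_turns


print(claude_md_overhead(5_000, 100))           # 500000, matching tip 3
print(compaction_savings(143_280, 18_200, 20))  # 2501600
```

The second figure uses the before and after sizes from the /compact example earlier in this guide: every remaining turn is about 125K input tokens cheaper.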

200K: context window size
5–10×: healthy input/output ratio
15×+: ratio that signals context bloat

Context cost and rate-limit cost are the same budget

Every input token in your context counts against your 5-hour rate-limit window. When input tokens dominate your usage, cutting your context in half roughly doubles the number of messages you can send before hitting the limit. Managing what stays in context, rather than writing shorter prompts, is the highest-leverage optimization available. For a full breakdown of how the rate-limit budget works, see how to reduce Claude Code token usage. For what to do when context overhead has already pushed you into a usage limit, see what to do when Claude Code usage limit is reached.
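A back-of-the-envelope sketch shows the relationship between context size and how far a fixed budget goes. The 10M-token budget is a made-up number for illustration only; actual rate-limit budgets are not stated in this guide, and the sketch assumes context stays constant and counts input tokens only.

```python
def messages_within_budget(budget_tokens: int, context_tokens: int) -> int:
    """Messages that fit in a token budget when each one re-sends the full context."""
    return budget_tokens // context_tokens


BUDGET = 10_000_000  # hypothetical input-token budget for one window

print(messages_within_budget(BUDGET, 150_000))  # 66
print(messages_within_budget(BUDGET, 75_000))   # 133
```

Halving the context from 150K to 75K takes the same budget from 66 messages to 133.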

Related guides

How to Reduce Claude Code Token Usage

Ten ways to burn fewer tokens — starting with measuring where they actually go.

Claude Code Usage Limit Reached? Here's What to Do

Locked out mid-session? See what you used and when the limit resets.

How to Check Claude Code Token Usage

See your Claude Code tokens by day, model, and project — and how the built-in /usage compares.

