Claude Code Rate Limits, Explained

Claude Code enforces two independent token budgets at the same time. Understanding both is the only way to predict when you'll hit a wall — and avoid it.

By Arham Wani · Last updated June 2026 · whoburnedmore guide

Quick answer

Claude Code rate limits in 2026 operate on two separate windows: a rolling 5-hour window and a weekly aggregate cap. Both count input and output tokens, and hitting either one blocks all new requests until the relevant window resets. The built-in /usage shows your 5-hour utilization; npx whoburnedmore shows your full weekly picture. 📊

Most developers assume Claude Code has a single rate limit — one number, one reset. In reality there are two independent clocks running simultaneously. You can be well within your 5-hour window and still get blocked because your weekly aggregate tripped. This guide explains exactly how each window works, what triggers a 429-style block, and how to calculate your real utilization before you hit a wall.

How do the two rate-limit windows work?

The 5-hour rolling window is a budget anchored to your first request: the clock starts on that request and expires 5 hours later. Every token consumed in that span, input and output alike, counts against the window. When the 5 hours elapse, the window resets and you start fresh. This is the limit most developers encounter during a long coding sprint.

The weekly cap is a separate, larger aggregate. It accumulates across all your 5-hour windows throughout the week and resets once per calendar week. Anthropic introduced it in 2025 to prevent a small number of extremely heavy users from degrading service for everyone else on shared infrastructure. If you routinely max out your 5-hour window multiple times a day, you will eventually encounter the weekly cap.

[Figure: token consumption pattern that triggers both limits. On peak days the 5-hour window is exhausted twice. Unit: M tokens.]

Which window resets first?

The 5-hour window resets 5 hours after your first request in that window — not from the last request, and not from a fixed clock. If you open Claude Code at 9:00 AM and make your first request at 9:15 AM, your window resets at 2:15 PM regardless of what you did in between. The weekly window resets on a fixed schedule, likely midnight UTC on a Monday, though Anthropic has not published a guaranteed reset time.
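The reset rule above is plain arithmetic on the timestamp of the first request. Here is a minimal sketch, assuming the window is exactly 5 hours long as this guide describes. `window_reset_time` and `time_until_reset` are hypothetical helpers for illustration, not part of Claude Code.

```python
from datetime import datetime, timedelta

# Window length as described in this guide (an assumption, not a published constant).
WINDOW_LENGTH = timedelta(hours=5)


def window_reset_time(first_request: datetime) -> datetime:
    """Return when the window opened by first_request resets."""
    return first_request + WINDOW_LENGTH


def time_until_reset(first_request: datetime, now: datetime) -> timedelta:
    """Return the time left in the window, or zero if it has already reset."""
    remaining = window_reset_time(first_request) - now
    return max(remaining, timedelta(0))


first = datetime(2026, 6, 1, 9, 15)  # first request at 9:15 AM
print(window_reset_time(first).strftime("%H:%M"))             # 14:15
print(time_until_reset(first, datetime(2026, 6, 1, 13, 28)))  # 0:47:00
```

Requests made between 9:15 AM and 2:15 PM do not move the reset time, which is why the function takes only the first request as input.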

Clearing the 5-hour window does not clear the weekly cap

These are additive limits, not alternative ones. If you hit the weekly cap, waiting 5 hours will not help. The weekly counter keeps its value; only the weekly reset clears it. If you come back after 5 hours and you're still blocked, you have hit the weekly aggregate. 🔥
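One way to picture the two counters is as a pair of independent budgets where only one of them is cleared by the 5-hour reset. The sketch below is a toy model, not Anthropic's implementation. The `Usage` type, both helper functions, and the ceiling values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    five_hour_tokens: int
    five_hour_ceiling: int  # hypothetical value; real ceilings are not published
    weekly_tokens: int
    weekly_ceiling: int     # hypothetical value; real ceilings are not published


def blocking_limits(u: Usage) -> list[str]:
    """Return the name of every limit that is currently exhausted."""
    blocked = []
    if u.five_hour_tokens >= u.five_hour_ceiling:
        blocked.append("5-hour window")
    if u.weekly_tokens >= u.weekly_ceiling:
        blocked.append("weekly cap")
    return blocked


def after_five_hour_reset(u: Usage) -> Usage:
    """Model a 5-hour reset: the window counter clears, the weekly counter stays."""
    return Usage(0, u.five_hour_ceiling, u.weekly_tokens, u.weekly_ceiling)


usage = Usage(11_000_000, 11_000_000, 60_000_000, 60_000_000)
print(blocking_limits(usage))                         # ['5-hour window', 'weekly cap']
print(blocking_limits(after_five_hour_reset(usage)))  # ['weekly cap']
```

The second call shows the case described above: the 5-hour window is clear again, yet requests would still be blocked by the weekly counter.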

What triggers a rate-limit block and what does it look like?

Any API call — whether you typed a prompt manually, triggered a tool call, or loaded a file into context — consumes tokens against your limits. There is no separate bucket for context loading vs. generation. The block is categorical: once the limit is hit, all new requests fail until the window resets, even short ones.

zsh — ~/api-project

$ claude "add error handling to the payment service"

  Claude Code
  ✗ API Error 429
  Rate limit exceeded. Usage limits for your account
  have been reached. Try again after the window resets.

$ claude /usage

  Window usage
  ████████████████████ 100%  (5-hour window full)
  Resets in: 0h 47m
  Weekly:    87% of weekly cap consumed

The /usage command gives you the 5-hour percentage and an approximate countdown to reset. It does not show the absolute token count or the weekly aggregate in token form, only percentages and an estimate of time remaining. That's useful for knowing when to try again, but useless for understanding your burn pattern or deciding whether to change your workflow.

How do I calculate my real utilization?

Utilization is your consumed tokens as a fraction of your window budget. Because Anthropic does not publish the exact token ceiling for each plan tier, you work backwards from the percentage that /usage reports plus the absolute token count from your local logs.

ceiling_5h = tokens_consumed ÷ (utilization_% ÷ 100)

Infer your window ceiling from the /usage percentage and your log-derived token count.

Run npx whoburnedmore immediately after a session that hit, say, 80% utilization. The token total in the output divided by 0.80 gives you your approximate window size for your plan. Repeat across a few sessions to get a reliable estimate. Once you know your ceiling, you can pace yourself.

zsh — any directory

$ npx whoburnedmore
  ↳ reading claude code usage logs…

  TODAY (last 5h window sampled)
  input tokens      7.4M
  output tokens     1.2M
  total             8.6M
  /usage reported ~78% → estimated ceiling ~11M

  WEEKLY TOTAL
  tokens    38.2M     cost $148.70
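The back-calculation can be sketched in a few lines. This is a minimal example that assumes you copy the token total and the /usage percentage by hand. `estimate_ceiling` is a hypothetical helper, the first sample reuses the figures from the output above, and the other two samples are invented to show how repeated sessions are combined.

```python
from statistics import median


def estimate_ceiling(tokens_consumed: float, utilization_pct: float) -> float:
    """Infer the window ceiling from tokens used and the /usage percentage."""
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in (0, 100]")
    return tokens_consumed / (utilization_pct / 100)


# (tokens consumed, /usage percentage) per session.
# First pair is the sample output above; the rest are made-up readings.
sessions = [(8.6e6, 78), (6.6e6, 60), (9.9e6, 90)]

estimates = [estimate_ceiling(tokens, pct) for tokens, pct in sessions]
print(f"{median(estimates) / 1e6:.1f}M")  # 11.0M
```

The median is used rather than the mean so that one badly timed reading, for example a percentage noted a few minutes after the token count, does not skew the estimate.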

Pro vs Max: how the ceilings differ

Claude Pro and Claude Max have meaningfully different rate-limit ceilings. Max plans were specifically designed to give heavy users 5× or more headroom compared to Pro, with a correspondingly higher weekly aggregate. If you consistently hit the 5-hour window before your work session naturally ends, or you find your weekly total exhausted by Thursday, you are on the wrong plan tier for your actual usage volume. For comparison with Codex's two-window limit system, see Codex usage limits explained.

How does usage tracking help avoid rate limits?

The most effective prevention strategy is measurement before limits are triggered, not recovery after. Once you have a weekly baseline — say, 35M tokens over five working days — you can divide your remaining weekly budget across the days left in the week and pace your heaviest tasks accordingly.
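The pacing arithmetic is a single division. Here is a small sketch, assuming you treat your weekly baseline as the budget to stay within. `daily_budget` is a hypothetical helper, and the 16M "used so far" figure is invented for the example.

```python
def daily_budget(weekly_budget: float, used_so_far: float, days_left: int) -> float:
    """Split the remaining weekly budget evenly across the days left."""
    if days_left <= 0:
        raise ValueError("days_left must be positive")
    return max(weekly_budget - used_so_far, 0) / days_left


# 35M-token baseline from the text; 16M already used, three working days left.
per_day = daily_budget(35e6, 16e6, 3)
print(f"{per_day / 1e6:.2f}M tokens per day")  # 6.33M tokens per day
```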

1. Establish your baseline with 2 weeks of data
   Run npx whoburnedmore at the end of each day for two weeks. Note which days and which types of tasks drove the highest input token counts. Refactoring large codebases and reading long files into context are typically the biggest drivers.

2. Monitor mid-session with /usage
   Before starting any task you expect to be expensive (large refactors, reading many files, long back-and-forth refinement chains), check /usage first. If you're already above 60%, consider compacting your context or shifting the task to a new day.

3. Compact aggressively before heavy tasks
   Run /compact to summarize and shrink your context before expensive operations. This reduces the per-message token cost by removing older turns that Claude has already acted on and no longer needs verbatim.

4. Track your weekly spend trend
   Re-run npx whoburnedmore mid-week. If Tuesday plus Wednesday already exceeds half your typical weekly total, throttle Thursday and Friday to avoid a weekly-cap lockout on a deadline day.
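The two thresholds in the steps above (60% of the 5-hour window before a heavy task, and half the typical weekly total by mid-week) can be written as simple checks. This is a sketch with hypothetical function names, and the daily token figures are invented. You would supply the numbers yourself from /usage and npx whoburnedmore.

```python
def should_compact_first(five_hour_pct: float, threshold: float = 60.0) -> bool:
    """True when the 5-hour window is already above the pre-task threshold."""
    return five_hour_pct > threshold


def should_throttle(tokens_so_far: float, typical_weekly_total: float) -> bool:
    """True when usage so far already exceeds half of a typical week."""
    return tokens_so_far > typical_weekly_total / 2


print(should_compact_first(72.0))         # True: compact or postpone the task
tuesday, wednesday = 9e6, 10e6            # made-up daily totals
print(should_throttle(tuesday + wednesday, 35e6))  # True: 19M > 17.5M
```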

Tool	Shows 5h %	Shows weekly total	Shows token count	Cross-tool
/usage (Claude Code)	yes	—	—	—
Claude web dashboard	—	partial	—	—
npx whoburnedmore	—	yes	yes	12+ tools

Use /usage for real-time window status; use whoburnedmore for pattern analysis and weekly totals.

The limits apply to all request types equally

File reads, tool calls, web searches, and prompt messages all consume tokens against your rate-limit budget. There is no privileged category that avoids counting. Reading a 500-line file into context at the start of every turn can easily consume 20–30K tokens per message without producing any output tokens — and those input tokens count just as much as the response tokens do. ⏳
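To see how quickly repeated context loading adds up, multiply the per-message cost by the number of turns. The sketch below uses 25K tokens per message (the midpoint of the range above). The 40-turn session and the 11M window ceiling are assumptions for illustration, and both helper names are hypothetical.

```python
def reread_cost(tokens_per_message: int, turns: int) -> int:
    """Total input tokens spent re-sending the same context on every turn."""
    return tokens_per_message * turns


def share_of_window(tokens: int, window_ceiling: int) -> float:
    """Fraction of the window budget that the given token count represents."""
    return tokens / window_ceiling


total = reread_cost(25_000, 40)  # 25K tokens per message over 40 turns
print(total)                                          # 1000000
print(f"{share_of_window(total, 11_000_000):.1%}")    # 9.1%
```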

If you've already hit your limit and need to understand what you burned to avoid repeating it, see the recovery guide: Claude Code usage limit reached — what to do. If you also use Codex CLI and want to compare the two limit systems side by side, the Codex CLI usage check guide covers Codex's equivalent 5-hour and weekly windows in detail.

Knowing your ceiling turns limits into a planning tool

Once you've used npx whoburnedmore for a week and estimated your plan's token ceiling, rate limits stop being surprises and become a known constraint. A known constraint is something you can plan around, just like disk space or memory.

Related guides

Claude Code Usage Limit Reached? Here's What to Do

Locked out mid-session? See what you used and when the limit resets.

Codex Usage Limits: 5-Hour and Weekly Caps

The 5-hour vs weekly windows on a ChatGPT plan — and how to track consumption.

How to Check Claude Code Token Usage

See your Claude Code tokens by day, model, and project — and how the built-in /usage compares.
