Claude Code Rate Limits, Explained
Claude Code enforces two independent token budgets at the same time. Understanding both is the only way to predict when you'll hit a wall — and avoid it.
Quick answer
/usage shows your 5-hour utilization; npx whoburnedmore shows your full weekly picture.
Most developers assume Claude Code has a single rate limit: one number, one reset. In reality there are two independent clocks running simultaneously. You can be well within your 5-hour window and still get blocked because your weekly aggregate tripped. This guide explains exactly how each window works, what triggers a 429-style block, and how to calculate your real utilization before you hit a wall.
How do the two rate-limit windows work?
The 5-hour rolling window is a sliding budget. The clock starts on your first request and expires 5 hours later. Every token — input and output — consumed in that span counts against the window. When the 5 hours elapse, the window resets and you start fresh. This is the limit most developers encounter during a long coding sprint.
The weekly cap is a separate, larger aggregate. It accumulates across all your 5-hour windows throughout the week and resets once per calendar week. Anthropic introduced it in 2025 to prevent a small number of extremely heavy users from degrading service for everyone else on shared infrastructure. If you routinely max out your 5-hour window multiple times a day, you will eventually encounter the weekly cap.
Which window resets first?
The 5-hour window resets 5 hours after your first request in that window — not from the last request, and not from a fixed clock. If you open Claude Code at 9:00 AM and make your first request at 9:15 AM, your window resets at 2:15 PM regardless of what you did in between. The weekly window resets on a fixed schedule, likely midnight UTC on a Monday, though Anthropic has not published a guaranteed reset time.
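As a minimal sketch of the reset rule described above, the following Python computes when a window anchored to the first request expires. The function names and the example date are illustrative assumptions; the only rule taken from this guide is "first request plus 5 hours".

```python
from datetime import datetime, timedelta

# Window length as described in this guide.
WINDOW_LENGTH = timedelta(hours=5)


def window_reset_time(first_request: datetime) -> datetime:
    """Return when the window opened by `first_request` resets."""
    return first_request + WINDOW_LENGTH


def time_until_reset(first_request: datetime, now: datetime) -> timedelta:
    """Return the time left in the window, never less than zero."""
    remaining = window_reset_time(first_request) - now
    return max(remaining, timedelta(0))


# Example from the text: first request at 9:15 AM (the date is arbitrary).
first_request = datetime(2025, 6, 2, 9, 15)
print(window_reset_time(first_request).strftime("%H:%M"))  # 14:15
print(time_until_reset(first_request, datetime(2025, 6, 2, 13, 28)))  # 0:47:00
```

Later requests do not appear anywhere in the calculation, which is the point: only the first request in the window moves the reset time.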
Clearing the 5-hour window does not clear the weekly cap
These are additive limits, not alternative ones. If you hit the weekly cap, waiting 5 hours will not help. The weekly counter keeps its value; only the weekly reset clears it. If you come back after 5 hours and you're still blocked, you have hit the weekly aggregate.
What triggers a rate-limit block and what does it look like?
Any API call — whether you typed a prompt manually, triggered a tool call, or loaded a file into context — consumes tokens against your limits. There is no separate bucket for context loading vs. generation. The block is categorical: once the limit is hit, all new requests fail until the window resets, even short ones.
$ claude "add error handling to the payment service"
Claude Code
✗ API Error 429
Rate limit exceeded. Usage limits for your account have been reached.
Try again after the window resets.

$ claude /usage
Window usage ████████████████████ 100% (5-hour window full)
Resets in: 0h 47m
Weekly: 87% of weekly cap consumed

The /usage command gives you the 5-hour percentage and an approximate countdown to reset. It does not show the absolute token count or the weekly aggregate in token form, just percentages and an estimate of time remaining. That's useful for knowing when to try again, but useless for understanding your burn pattern or deciding whether to change your workflow.
How do I calculate my real utilization?
Utilization is your consumed tokens as a fraction of your window budget. Because Anthropic does not publish the exact token ceiling for each plan tier, you work backwards from the percentage that /usage reports plus the absolute token count from your local logs.
Run npx whoburnedmore immediately after a session that hit, say, 80% utilization. The token total in the output divided by 0.80 gives you your approximate window size for your plan. Repeat across a few sessions to get a reliable estimate. Once you know your ceiling, you can pace yourself.
$ npx whoburnedmore
↳ reading claude code usage logs…
TODAY (last 5h window sampled)
input tokens 7.4M
output tokens 1.2M
total 8.6M
/usage reported ~78% → estimated ceiling ~11M
WEEKLY TOTAL
tokens 38.2M
cost $148.70
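The back-calculation above is a single division, sketched here in Python. The function name is hypothetical, and the inputs are the figures from the sample output (8.6M tokens at roughly 78% utilization); treat the result as an estimate, since the percentage that /usage reports is itself approximate.

```python
def estimate_ceiling(tokens_used: float, utilization: float) -> float:
    """Estimate the window ceiling from tokens used and reported utilization.

    `utilization` is a fraction in (0, 1], e.g. 0.78 for 78%.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be a fraction in (0, 1]")
    return tokens_used / utilization


# Figures from the sample output: 7.4M input + 1.2M output at ~78%.
total_tokens = 7.4e6 + 1.2e6
ceiling = estimate_ceiling(total_tokens, 0.78)
print(f"estimated ceiling: {ceiling / 1e6:.1f}M tokens")  # estimated ceiling: 11.0M tokens
```

Averaging the estimate over several sessions, as suggested above, smooths out rounding in the reported percentage.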
Pro vs Max: how the ceilings differ
Claude Pro and Claude Max have meaningfully different rate-limit ceilings. Max plans were specifically designed to give heavy users 5× or more headroom compared to Pro, with a correspondingly higher weekly aggregate. If you consistently hit the 5-hour window before your work session naturally ends, or you find your weekly total exhausted by Thursday, you are on the wrong plan tier for your actual usage volume. For comparison with Codex's two-window limit system, see Codex usage limits explained.
How does usage tracking help avoid rate limits?
The most effective prevention strategy is measurement before limits are triggered, not recovery after. Once you have a weekly baseline — say, 35M tokens over five working days — you can divide your remaining weekly budget across the days left in the week and pace your heaviest tasks accordingly.
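A minimal sketch of that pacing arithmetic, assuming you treat your weekly baseline as the budget to stay within. The function name and the "20M used so far" figure are illustrative assumptions; only the 35M baseline comes from the example above.

```python
def daily_budget(weekly_budget: float, used_so_far: float, days_left: int) -> float:
    """Split the remaining weekly budget evenly across the days left."""
    if days_left <= 0:
        raise ValueError("days_left must be at least 1")
    remaining = max(weekly_budget - used_so_far, 0.0)
    return remaining / days_left


# 35M weekly baseline, a hypothetical 20M already used, 2 working days left.
per_day = daily_budget(35e6, 20e6, 2)
print(f"{per_day / 1e6:.1f}M tokens per day")  # 7.5M tokens per day
```

An even split is the simplest policy; if you know one remaining day holds your heaviest task, you might weight that day more heavily instead.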
1. Establish your baseline with 2 weeks of data
   Run npx whoburnedmore at the end of each day for two weeks. Note which days and which types of tasks drove the highest input token counts. Refactoring large codebases and reading long files into context are typically the biggest drivers.
2. Monitor mid-session with /usage
   Before starting any task you expect to be expensive (large refactors, reading many files, long back-and-forth refinement chains), check /usage first. If you're already above 60%, consider compacting your context or shifting the task to a new day.
3. Compact aggressively before heavy tasks
   Run /compact to summarize and shrink your context before expensive operations. This reduces the per-message token cost by removing older turns that Claude has already acted on and no longer needs verbatim.
4. Track your weekly spend trend
   Re-run npx whoburnedmore mid-week. If Tuesday plus Wednesday already exceeds half your typical weekly total, throttle Thursday and Friday to avoid a weekly-cap lockout on a deadline day.
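The mid-week check in the last step can be sketched as a simple comparison. The function name and the per-day token figures are hypothetical; the "more than half the typical weekly total" rule is the one stated above.

```python
def should_throttle(tokens_by_day: dict[str, float], typical_weekly_total: float) -> bool:
    """Return True if usage so far exceeds half the typical weekly total."""
    used = sum(tokens_by_day.values())
    return used > typical_weekly_total / 2


# Hypothetical daily totals, checked against a 35M typical week.
so_far = {"Tue": 9.0e6, "Wed": 10.0e6}
print(should_throttle(so_far, 35e6))  # True
```

Here 19M already exceeds the 17.5M halfway mark, so the remaining days would need lighter sessions.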
| Tool | Shows 5h % | Shows weekly total | Shows token count | Cross-tool |
|---|---|---|---|---|
| /usage (Claude Code) | yes | — | — | — |
| Claude web dashboard | — | partial | — | — |
| npx whoburnedmore | — | yes | yes | 12+ tools |
The limits apply to all request types equally
File reads, tool calls, web searches, and prompt messages all consume tokens against your rate-limit budget. There is no privileged category that avoids counting. Reading a 500-line file into context at the start of every turn can easily consume 20–30K tokens per message without producing any output tokens, and those input tokens count just as much as the response tokens do.
If you've already hit your limit and need to understand what you burned to avoid repeating it, see the recovery guide: Claude Code usage limit reached — what to do. If you also use Codex CLI and want to compare the two limit systems side by side, the Codex CLI usage check guide covers Codex's equivalent 5-hour and weekly windows in detail.
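To see how quickly repeated context loading adds up, here is a rough sketch of the arithmetic. The 25K figure is the midpoint of the 20–30K range mentioned in this section; the 40-message session length and the function name are assumptions, and the ~11M ceiling is the estimate from the earlier sample output, not a published limit.

```python
def cumulative_input_tokens(tokens_per_message: int, messages: int) -> int:
    """Total input tokens when the same context is re-sent on every message."""
    return tokens_per_message * messages


# 25K tokens of file context per message over a hypothetical 40-message session.
total = cumulative_input_tokens(25_000, 40)
share = total / 11_000_000  # against the ~11M ceiling estimated earlier
print(f"{total:,} input tokens, about {share:.1%} of the estimated window")
# 1,000,000 input tokens, about 9.1% of the estimated window
```

This counts only the repeated file context; prompts, tool results, and responses come on top of it.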
Knowing your ceiling turns limits into a planning tool
Once you've used npx whoburnedmore for a week and estimated your plan's token ceiling, rate limits stop being surprises and become a known constraint. A known constraint is something you can plan around, just like disk space or memory.
Related guides
Claude Code Usage Limit Reached? Here's What to Do
Locked out mid-session? See what you used and when the limit resets.
Codex Usage Limits: 5-Hour and Weekly Caps
The 5-hour vs weekly windows on a ChatGPT plan — and how to track consumption.
How to Check Claude Code Token Usage
See your Claude Code tokens by day, model, and project — and how the built-in /usage compares.