How to Check Hermes Agent Token Usage and Cost
Hermes Agent can route a single task through several providers in one run. That flexibility is the point — and it is also why nobody can tell you what they actually spent last week.
Quick answer
npx whoburnedmore reads Hermes Agent's local logs, totals tokens by day and model, estimates cost across providers, and ranks you. No account, nothing uploaded, and it handles every provider you route through in one table. 🔥
Hermes Agent is Nous Research's self-improving terminal coding agent, the one with persistent memory across runs and a sandboxed execution loop. Its headline feature is reach: it can drive 300+ models across a long list of providers, picking whichever fits the step in front of it. Great for results, rough for accounting. Your spend ends up smeared across half a dozen billing dashboards that never add up to one honest number.
How do I see my Hermes Agent token usage?
Hermes keeps a local record of every model call it makes — which model, which provider, how many tokens in and out. You do not have to log into anything to read it. One command scans that record and folds it into a single daily ledger:
    $ npx whoburnedmore
    ↳ reading Hermes Agent run history…
    ↳ 1,204 model calls across 6 providers

    HERMES AGENT — TOKENS BY DAY
    ────────────────────────────────────────
    2026-06-17   in 4,118,300   out 312,700
    2026-06-16   in 2,640,900   out 201,400
    2026-06-15   in 5,902,100   out 470,800

    BY PROVIDER
    provider A   11.4M tokens
    provider B    6.8M tokens
    provider C    2.1M tokens

    7-day total: 31.7M tokens   est. $128.90
That last line is the one Hermes itself will never print, because Hermes does not know what any single provider charges — it just routes. whoburnedmore fills that gap by reading the run history off disk and attaching a rate to each model it finds, so the total is the sum of real calls rather than a guess.
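To make the "fold it into a daily ledger" step concrete, here is a minimal sketch of the aggregation. It is not whoburnedmore's code, and the `ModelCall` record and its field names are hypothetical: the actual layout of Hermes's run history is not described here. The sample records are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCall:
    """One logged model call. Field names are hypothetical, not Hermes's schema."""
    day: str          # ISO date, e.g. "2026-06-17"
    provider: str
    model: str
    tokens_in: int
    tokens_out: int


def daily_ledger(calls):
    """Fold individual calls into {day: {"in": total, "out": total}}."""
    ledger = defaultdict(lambda: {"in": 0, "out": 0})
    for call in calls:
        ledger[call.day]["in"] += call.tokens_in
        ledger[call.day]["out"] += call.tokens_out
    return dict(ledger)


# Made-up sample records, for illustration only.
SAMPLE_CALLS = [
    ModelCall("2026-06-17", "provider A", "model-x", 1200, 300),
    ModelCall("2026-06-17", "provider B", "model-y", 800, 150),
    ModelCall("2026-06-16", "provider A", "model-x", 500, 100),
]

if __name__ == "__main__":
    # Newest day first, mirroring the layout of the report above.
    for day, totals in sorted(daily_ledger(SAMPLE_CALLS).items(), reverse=True):
        print(f"{day}  in {totals['in']:>9,}  out {totals['out']:>7,}")
```

Grouping by `call.provider` instead of `call.day` would give the by-provider view with the same loop.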
Self-improving means more calls than you think
Because Hermes revises its own approach mid-task, a single prompt can fan out into many model calls: planning, sandboxed execution, a retry, a critique pass. Each one is billed. A per-day ledger is the only honest way to see how that loop accumulates; the agent's live output scrolls past too fast to tally.
How do I track cost across Hermes's many models?
This is the part that makes Hermes different from a single-model tool. When one run touches several providers, no provider's own dashboard shows the whole picture — each one only sees its slice. Consolidating means pulling every slice back into one place and labeling it by where it ran 📊:
- Provider A (planning): 53%
- Provider B (codegen): 31%
- Provider C (review): 10%
- Provider D (cheap drafts): 6%
Notice how the cheapest provider can still carry a big share of calls while a pricier one carries most of the cost — the two are not the same chart. Splitting usage by provider is what lets you ask the useful question: is the expensive route earning its keep, or could the same step go to a cheaper model with no real loss in quality?
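The gap between "share of calls" and "share of cost" is easy to see in a few lines. The sketch below is an illustration only: the `(provider, cost)` pairs are invented, the function names are ours, and it assumes a per-call cost has already been estimated.

```python
from collections import Counter


def _shares(totals):
    """Convert absolute totals into fractions of the overall sum."""
    overall = sum(totals.values())
    if overall == 0:
        return {key: 0.0 for key in totals}
    return {key: value / overall for key, value in totals.items()}


def call_and_cost_shares(calls):
    """calls: iterable of (provider, estimated_cost_usd) pairs, one per model call.

    Returns two dicts: share of call count and share of cost, per provider.
    """
    call_counts = Counter()
    cost_totals = Counter()
    for provider, cost in calls:
        call_counts[provider] += 1
        cost_totals[provider] += cost
    return _shares(call_counts), _shares(cost_totals)


# Invented numbers: a pricey provider with few calls, a cheap one with many.
SAMPLE = [("provider A", 4.00)] * 2 + [("provider D", 0.25)] * 8

if __name__ == "__main__":
    by_calls, by_cost = call_and_cost_shares(SAMPLE)
    for provider in sorted(by_calls):
        print(f"{provider}: {by_calls[provider]:.0%} of calls, "
              f"{by_cost[provider]:.0%} of cost")
```

In this made-up sample the cheap provider handles most of the calls while the expensive one accounts for most of the spend, which is the pattern worth checking for in your own numbers.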
Why provider sprawl hides your real number
Suppose a week of Hermes work touched six providers. To total it by hand you would open six billing pages, normalize six different token-pricing conventions, line up six date ranges, and hope none of them rounded differently. Most people never do this, which is exactly how a routing agent quietly becomes the largest line on the bill. Reading one local log sidesteps the whole chore.
| Approach | Whole-run total | Per-model split | Daily history | Setup |
|---|---|---|---|---|
| One provider's dashboard | — | — | that provider | login each |
| Spreadsheet by hand | | | | hours |
| whoburnedmore | yes | yes | yes | one command |
How much is Hermes Agent costing me?
Cost is just usage times a rate, but with Hermes the rate is not one number: it is a different price for every model and provider you touched. So the honest total is a sum over calls, each priced by where it actually ran.
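Written out, the sum looks roughly like this. The notation is ours, and it assumes each provider bills input and output tokens at separate per-token rates, which is a common convention but is not confirmed here for every provider Hermes can reach.

```latex
\[
C_{\text{total}} \;=\; \sum_{i \in \text{calls}}
  \left( t^{\text{in}}_{i}\, r^{\text{in}}_{m(i)} \;+\; t^{\text{out}}_{i}\, r^{\text{out}}_{m(i)} \right)
\]
```

Here $t^{\text{in}}_{i}$ and $t^{\text{out}}_{i}$ are the input and output tokens of call $i$, $m(i)$ is the model and provider that served it, and $r^{\text{in}}_{m(i)}$, $r^{\text{out}}_{m(i)}$ are the per-token rates for that route.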
We will not quote per-token prices here, because for Hermes there is no single price to quote — rates vary by the model and provider you route to, and they move. What matters is the method: whoburnedmore reads each call, looks up the rate that applied to that specific model, and adds it in. A run that leaned on a premium provider for codegen will cost more than the same token count routed through a budget model, and the total reflects that honestly.
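The method can be sketched as a rate lookup followed by a sum. Everything named here is an assumption for illustration: the rate table, the provider and model names, and the per-million-token pricing unit are placeholders, not real prices and not whoburnedmore's internals.

```python
# Placeholder rates in USD per one million tokens: (input_rate, output_rate).
# These are NOT real prices; they only make the arithmetic visible.
RATES_PER_MILLION = {
    ("provider A", "premium-model"): (10.0, 30.0),
    ("provider D", "budget-model"): (0.5, 1.5),
}


def call_cost(provider, model, tokens_in, tokens_out, rates=RATES_PER_MILLION):
    """Price one call by the rate of the specific route that served it.

    Raises KeyError for an unknown route rather than guessing a price.
    """
    rate_in, rate_out = rates[(provider, model)]
    return (tokens_in * rate_in + tokens_out * rate_out) / 1_000_000


def total_cost(calls, rates=RATES_PER_MILLION):
    """calls: iterable of (provider, model, tokens_in, tokens_out) tuples."""
    return sum(call_cost(*call, rates=rates) for call in calls)


if __name__ == "__main__":
    # Same token counts, two different routes.
    premium = call_cost("provider A", "premium-model", 1_000_000, 100_000)
    budget = call_cost("provider D", "budget-model", 1_000_000, 100_000)
    print(f"premium route: ${premium:.2f}")
    print(f"budget route:  ${budget:.2f}")
```

With these placeholder rates, identical token counts cost very different amounts depending on the route, which is why a token total alone cannot stand in for a cost total.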
Already running other agents too?
If you also use other terminal coding agents alongside Hermes, the same command lists them side by side. See the cross-tool usage guide for the combined view: Hermes spend next to everything else, one table.
Where does Hermes keep this data, and is it private?
Hermes writes its run history to a local directory on the machine that ran the agent — alongside the persistent memory it uses to improve between sessions. whoburnedmore reads only the numeric side of that record: token counts, model names, provider labels, and timestamps. It does not read your prompts, your code, or your file names, and nothing about them is uploaded. 🛡️
    $ npx whoburnedmore --dry-run
    would submit (aggregates only):
    tokens_in_total    31,742,800
    tokens_out_total   2,544,100
    models_seen        9
    prompts / code / filenames: never read, never sent
Use --dry-run to print exactly what a submission would contain, or npx whoburnedmore --local to keep the whole breakdown on your own machine and skip the leaderboard entirely. Either way, the agent's memory and your source stay where Hermes left them.
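As a rough picture of what "aggregates only" means, the sketch below builds a payload with the three fields shown in the dry-run output and nothing else. It is a hypothetical illustration, not the tool's implementation: the record layout, including the `prompt` key, is invented to show that free-text fields are never touched.

```python
def build_aggregate(records):
    """Reduce per-call records to totals; only numeric fields and model names are read."""
    tokens_in_total = 0
    tokens_out_total = 0
    models = set()
    for record in records:
        tokens_in_total += record["tokens_in"]
        tokens_out_total += record["tokens_out"]
        models.add(record["model"])
        # Any other keys in the record (prompt text, file paths) are never accessed.
    return {
        "tokens_in_total": tokens_in_total,
        "tokens_out_total": tokens_out_total,
        "models_seen": len(models),
    }


# Invented records; the "prompt" key stands in for content that must stay local.
SAMPLE_RECORDS = [
    {"model": "model-x", "tokens_in": 1200, "tokens_out": 300, "prompt": "secret text"},
    {"model": "model-y", "tokens_in": 800, "tokens_out": 150, "prompt": "more secrets"},
    {"model": "model-x", "tokens_in": 500, "tokens_out": 100, "prompt": "still private"},
]

if __name__ == "__main__":
    for key, value in build_aggregate(SAMPLE_RECORDS).items():
        print(f"{key:<18}{value:,}")
```

Because the payload is assembled from an explicit set of fields rather than by filtering a full record, there is no path by which prompt text could end up in it.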
Related guides
How to Check Your AI Coding Token Usage
The cross-tool overview: one command that totals your token usage and cost across every AI coding agent you run.
How to Check Amp, Droid, and Goose Token Usage
The newer agents don't have usage dashboards yet — one command covers all three.
The Best AI Coding Token Trackers in 2026
ccusage vs tokscale vs native dashboards vs whoburnedmore — a free, cross-tool comparison.