How to Check Hermes Agent Token Usage and Cost

Hermes Agent can route a single task through several providers in one run. That flexibility is the point — and it is also why nobody can tell you what they actually spent last week.

By Arham Wani · Last updated June 18, 2026 · whoburnedmore guide

Quick answer

To total your Hermes Agent spend, run npx whoburnedmore. It reads Hermes Agent's local logs, totals tokens by day and model, estimates cost across providers, and ranks you. No account is needed, none of your prompts or code is uploaded, and it handles every provider you route through in one table. 🔥

Hermes Agent is Nous Research's self-improving terminal coding agent — the one with persistent memory across runs and a sandboxed execution loop. Its headline feature is reach: it can drive 300+ models across a long list of providers, picking whichever fits the step in front of it. Great for results, rough for accounting. Your spend ends up smeared across half a dozen billing dashboards that never add up to one honest number.

How do I see my Hermes Agent token usage?

Hermes keeps a local record of every model call it makes — which model, which provider, how many tokens in and out. You do not have to log into anything to read it. One command scans that record and folds it into a single daily ledger:

zsh — ~/work

$ npx whoburnedmore
↳ reading Hermes Agent run history…
↳ 1,204 model calls across 6 providers

  HERMES AGENT — TOKENS BY DAY
  ────────────────────────────────────────
  2026-06-17  in 4,118,300  out 312,700
  2026-06-16  in 2,640,900  out 201,400
  2026-06-15  in 5,902,100  out 470,800

  BY PROVIDER
  provider A    11.4M tokens
  provider B     6.8M tokens
  provider C     2.1M tokens

  7-day total:  31.7M tokens   est. $128.90

That last line is the one Hermes itself will never print, because Hermes does not know what any single provider charges — it just routes. whoburnedmore fills that gap by reading the run history off disk and attaching a rate to each model it finds, so the total is the sum of real calls rather than a guess.
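The folding step can be sketched in a few lines of Python. This is an illustration only: the record fields (`ts`, `provider`, `tokens_in`, `tokens_out`) and the sample values are assumptions made for the example, not Hermes Agent's actual log schema or whoburnedmore's implementation.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call records; the field names are assumptions for illustration,
# not the real Hermes Agent log format.
calls = [
    {"ts": "2026-06-17T09:12:00", "provider": "provider A",
     "tokens_in": 1200, "tokens_out": 300},
    {"ts": "2026-06-17T09:13:30", "provider": "provider B",
     "tokens_in": 800, "tokens_out": 150},
    {"ts": "2026-06-16T18:02:10", "provider": "provider A",
     "tokens_in": 500, "tokens_out": 90},
]


def daily_ledger(records):
    """Sum input and output tokens per calendar day."""
    ledger = defaultdict(lambda: {"in": 0, "out": 0})
    for rec in records:
        day = datetime.fromisoformat(rec["ts"]).date().isoformat()
        ledger[day]["in"] += rec["tokens_in"]
        ledger[day]["out"] += rec["tokens_out"]
    return dict(ledger)


ledger = daily_ledger(calls)
for day in sorted(ledger, reverse=True):  # newest day first
    print(f"{day}  in {ledger[day]['in']:,}  out {ledger[day]['out']:,}")
```

Grouping by the date part of the timestamp is the whole trick: however many providers a run touched, every call lands in exactly one day's row.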

Self-improving means more calls than you think

Because Hermes revises its own approach mid-task, a single prompt can fan out into many model calls — planning, sandboxed execution, a retry, a critique pass. Each one is billed. A per-day ledger is the only honest way to see how that loop accumulates; the agent's live output scrolls past too fast to tally.

How do I track cost across Hermes's many models?

This is the part that makes Hermes different from a single-model tool. When one run touches several providers, no provider's own dashboard shows the whole picture — each one only sees its slice. Consolidating means pulling every slice back into one place and labeling it by where it ran 📊:

Provider A (planning): 53%
Provider B (codegen): 31%
Provider C (review): 10%
Provider D (cheap drafts): 6%

Notice how the cheapest provider can still carry a big share of calls while a pricier one carries most of the cost — the two are not the same chart. Splitting usage by provider is what lets you ask the useful question: is the expensive route earning its keep, or could the same step go to a cheaper model with no real loss in quality?
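A small Python sketch makes the difference between the two charts concrete. The provider names follow the labels used above, but the per-call costs are invented for illustration and do not come from any real bill.

```python
from collections import Counter

# Hypothetical per-call records: (provider, cost in USD). The numbers are
# made up to show the gap between call share and cost share.
calls = [
    ("provider D", 0.01), ("provider D", 0.01), ("provider D", 0.01),
    ("provider D", 0.01), ("provider D", 0.01), ("provider D", 0.01),
    ("provider A", 0.50), ("provider A", 0.44),
]


def shares(records):
    """Return {provider: (share_of_calls, share_of_cost)} as fractions of 1."""
    call_counts = Counter()
    cost_totals = Counter()
    for provider, cost in records:
        call_counts[provider] += 1
        cost_totals[provider] += cost
    n_calls = sum(call_counts.values())
    total_cost = sum(cost_totals.values())
    return {
        p: (call_counts[p] / n_calls, cost_totals[p] / total_cost)
        for p in call_counts
    }


result = shares(calls)
for provider, (call_share, cost_share) in result.items():
    print(f"{provider}: {call_share:.0%} of calls, {cost_share:.0%} of cost")
```

In this made-up sample the budget provider handles most of the calls while the premium one accounts for nearly all of the cost, which is why the two shares need to be reported separately.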

Why provider sprawl hides your real number

Suppose a week of Hermes work touched six providers. To total it by hand you would open six billing pages, normalize six different token-pricing conventions, line up six date ranges, and hope none of them rounded differently. Most people never do this, which is exactly how a routing agent quietly becomes the largest line on the bill. Reading one local log sidesteps the whole chore.
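To give a feel for the normalization step alone, here is a minimal Python sketch that converts prices quoted in different units to a common one. The unit labels and the prices are placeholders chosen for the example, not any provider's real rates.

```python
# Hypothetical unit labels mapped to the number of tokens each price covers.
UNIT_TOKENS = {"per_token": 1, "per_1k": 1_000, "per_1m": 1_000_000}


def usd_per_million(price, unit):
    """Convert a price quoted in `unit` to USD per one million tokens."""
    return price * 1_000_000 / UNIT_TOKENS[unit]


# Placeholder quotes: (provider, price in USD, unit the price is quoted in).
quotes = [
    ("provider A", 0.003, "per_1k"),
    ("provider B", 2.50, "per_1m"),
]
normalized = {name: usd_per_million(price, unit) for name, price, unit in quotes}
for name, rate in normalized.items():
    print(f"{name}: ${rate:.2f} per 1M tokens")
```

Doing this by hand for six providers, and repeating it whenever a rate changes, is the chore the paragraph above describes.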

Approach	Whole-run total	Per-model split	Daily history	Setup
One provider's dashboard	—	—	that provider	login each
Spreadsheet by hand				hours
whoburnedmore				one command

Only the local-log approach reconciles every provider Hermes routed through into one total.

How much is Hermes Agent costing me?

Cost is just usage times a rate — but with Hermes the rate is not one number, it is a different price for every model and provider you touched. So the honest total is a sum over calls, each priced by where it actually ran:

cost = Σ over calls (in_call × r_in + out_call × r_out)

Each model call is priced at its own provider's input and output rate, then summed across the whole run.

We will not quote per-token prices here, because for Hermes there is no single price to quote — rates vary by the model and provider you route to, and they move. What matters is the method: whoburnedmore reads each call, looks up the rate that applied to that specific model, and adds it in. A run that leaned on a premium provider for codegen will cost more than the same token count routed through a budget model, and the total reflects that honestly.
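The method can be sketched in Python as follows. The rate table is entirely hypothetical: the figures are placeholders picked to make the arithmetic easy to follow, the `(provider, model)` key is an assumption about how rates might be looked up, and none of it reflects whoburnedmore's internals or any real price list.

```python
# Hypothetical rates in USD per one million tokens, keyed by (provider, model).
# These are placeholders, not real prices.
RATES = {
    ("provider A", "model-x"): {"in": 3.00, "out": 15.00},
    ("provider D", "model-z"): {"in": 0.25, "out": 1.00},
}

# Two calls with identical token counts, routed to different providers.
calls = [
    {"provider": "provider A", "model": "model-x",
     "tokens_in": 1_000_000, "tokens_out": 100_000},
    {"provider": "provider D", "model": "model-z",
     "tokens_in": 1_000_000, "tokens_out": 100_000},
]


def call_cost(call, rates):
    """Price one call at the input and output rate of the model it ran on."""
    rate = rates[(call["provider"], call["model"])]
    return (call["tokens_in"] * rate["in"]
            + call["tokens_out"] * rate["out"]) / 1_000_000


def total_cost(records, rates):
    """Sum the per-call costs across the whole run."""
    return sum(call_cost(c, rates) for c in records)


for c in calls:
    print(f"{c['provider']}: ${call_cost(c, RATES):.2f}")
print(f"total: ${total_cost(calls, RATES):.2f}")
```

Both sample calls carry the same token counts, yet under these placeholder rates the premium route costs many times more than the budget one, so a total computed from a single blended rate would misstate both.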

Already running other agents too?

If you also use other terminal coding agents alongside Hermes, the same command lists them side by side. See the cross-tool usage guide for the combined view — Hermes spend next to everything else, one table.

Where does Hermes keep this data, and is it private?

Hermes writes its run history to a local directory on the machine that ran the agent — alongside the persistent memory it uses to improve between sessions. whoburnedmore reads only the numeric side of that record: token counts, model names, provider labels, and timestamps. It does not read your prompts, your code, or your file names, and nothing about them is uploaded. 🛡️

zsh — what gets read vs. what stays put

$ npx whoburnedmore --dry-run
would submit (aggregates only):
  tokens_in_total   31,742,800
  tokens_out_total   2,544,100
  models_seen        9
  prompts / code / filenames:  never read, never sent

Use --dry-run to print exactly what a submission would contain, or npx whoburnedmore --local to keep the whole breakdown on your own machine and skip the leaderboard entirely. Either way, the agent's memory and your source stay where Hermes left them.
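As a rough picture of what "aggregates only" means, the Python sketch below builds a summary from numeric fields and model names alone. The key names mirror the dry-run output shown above; the record layout and the `prompt` field are assumptions for illustration, not the tool's real data model.

```python
# Hypothetical in-memory records; "prompt" stands in for sensitive content
# that an aggregates-only summary should never touch.
calls = [
    {"model": "model-x", "tokens_in": 1200, "tokens_out": 300, "prompt": "secret"},
    {"model": "model-y", "tokens_in": 800, "tokens_out": 150, "prompt": "secret"},
    {"model": "model-x", "tokens_in": 500, "tokens_out": 90, "prompt": "secret"},
]


def aggregate_only(records):
    """Build a summary from token counts and model names only."""
    return {
        "tokens_in_total": sum(r["tokens_in"] for r in records),
        "tokens_out_total": sum(r["tokens_out"] for r in records),
        "models_seen": len({r["model"] for r in records}),
    }


summary = aggregate_only(calls)
print(summary)
```

Because the function only ever reads the count and model fields, prompt text cannot leak into the summary even by accident.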

300+ models Hermes can route to

Related guides

How to Check Your AI Coding Token Usage

The cross-tool overview: one command that totals your token usage and cost across every AI coding agent you run.

How to Check Amp, Droid, and Goose Token Usage

The newer agents don't have usage dashboards yet — one command covers all three.

The Best AI Coding Token Trackers in 2026

ccusage vs tokscale vs native dashboards vs whoburnedmore — a free, cross-tool comparison.
