Stop Asking People to Tag Their AI Spend

About half our AI spend last week was on API keys with no team tag, no workload tag, no owner. That’s not a discipline problem; it’s a product problem, and the fix isn’t a better tagging UI.

Paul Merrison

May 19, 2026

You will, sooner or later, be asked to break down your organization’s AI spend by team. The CFO will want it for chargeback. Your VP will want it for the budget review. Your own engineering managers will want it for 1:1s. And when you go to your AI cost dashboard to produce the breakdown, you will discover that it can’t, at least not for half the spend.

We had this experience last week. Across about thirty Tetrate engineers using Agent Router to route their LLM traffic, we spent roughly $4,000 in seven days. About half of that spend was on API keys with no team tag, no workload tag, and no owner beyond an email address. The tag column was right there in the dashboard. People just hadn’t filled it in.

Key takeaways

AI cost attribution fails the same way cloud cost attribution failed: optional tagging never gets done. Roughly half our own AI spend last week was on untagged keys.
The fix is to make attribution a property of key issuance, not a property of usage. Every key inherits an owner, team, agent, and workload at the moment it’s created.
Every AI request needs four labels — owner, team, agent, workload — because each one answers a different manager’s question.
The routing layer is the right place to enforce this, because it’s the only point in the stack with the full picture at request time.

This is the universal experience of AI cost attribution right now, and it is not going to fix itself. Here are the four mistakes managers make about it, and the one product decision that resolves all of them.

Mistake 1: Relying on people to tag their own AI spend

Cloud cost management taught the industry this lesson over a decade ago. If tagging is optional, it doesn’t happen. Engineers spinning up resources to ship features don’t pause to fill in metadata fields whose value accrues to someone in finance, six months from now. AWS spent ten years learning this. AI cost tooling is currently rediscovering it.

Our roughly 50% untagged spend is not a discipline failure. It is a product failure. The dashboard offered a tag column. The people creating API keys did not see why filling it in was their problem. They were right.

The fix is not a reminder Slack channel, a Friday tagging review, or a Confluence page about tagging standards. None of those will work. The fix is to remove the choice.

Mistake 2: Reconstructing AI cost attribution from the bill

The next temptation is to push the problem to month-end. Take the invoice, run it through allocation logic — by user, by project, by team — and produce a chargeback report. This works for cloud spend because cloud resources have stable identifiers that map back to projects.

AI spend doesn’t. An API key called code-agent-key could belong to anyone. The user field tells you who created it, not which agent it serves, which workflow runs through it, or which team owns the workflow. Reconstructing those mappings after the fact requires institutional memory you don’t have — by the time the bill arrives, the engineer who created the key is on a different team, the agent has been renamed twice, and the workload has migrated to a different repo.

The right time to capture attribution is at the moment the key is issued. Anything later is archaeology, and archaeology is what you do when the records are lost.

Mistake 3: Tagging the wrong dimension

Even teams that do tag often tag the wrong dimension. They tag a key with one label — “team-platform” or “agent-cost” — which sounds right and is useless. What you actually need on every request is at least four facts:

Owner. The individual person responsible for the key.
Team. The team that pays for the spend.
Agent. The specific agent or application using the key.
Workload. The use case — coding assistance, security questionnaires, CI automation, research, customer support.

You need all four because each one answers a different manager’s question. Finance wants spend by team. Platform engineering wants spend by agent. A team lead wants spend by owner. Product wants spend by workload. The same dollar of spend has to be sliceable four ways, which means it has to be labeled four ways at the time of the request, not derived afterwards from a key name or a guess about ownership.
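To make "sliceable four ways" concrete, here is a minimal sketch. The `LedgerEntry` schema, the `spend_by` helper, and the sample rows are all hypothetical illustrations, not Agent Router's actual data model. The point is only that once each priced request carries the four labels, every manager's report is the same one-line grouping over a different field.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    """One priced request, labeled four ways at request time (hypothetical schema)."""
    owner: str
    team: str
    agent: str
    workload: str
    cost_usd: float


def spend_by(entries, dimension):
    """Total spend grouped by one of: owner, team, agent, workload."""
    totals = defaultdict(float)
    for entry in entries:
        totals[getattr(entry, dimension)] += entry.cost_usd
    return dict(totals)


# Illustrative rows only; these are not figures from the article.
ledger = [
    LedgerEntry("alice", "platform", "code-agent", "coding", 12.0),
    LedgerEntry("bob", "platform", "ci-bot", "ci", 3.0),
    LedgerEntry("alice", "security", "questionnaire-bot", "security", 5.0),
]

by_team = spend_by(ledger, "team")    # what finance asks for
by_owner = spend_by(ledger, "owner")  # what a team lead asks for
```

With a single label per key, only one of these groupings would be possible; the others would have to be guessed.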

[SCREENSHOT: Agent Router key list, four label columns (owner, team, agent, workload), redacted]

Mistake 4: Treating “untagged” as a valid state

If your platform lets a request through without attribution, you will have untagged spend forever. The 50% in our own data is exactly what happens when the system tolerates the gap. The only way to drive it to zero is to make the gap impossible.

That means: no key issued without the four labels above. No UI path that skips them. No defaults that paper over missing information. If you can’t say who owns this key and what it’s for, the system doesn’t give you a key.

This sounds bureaucratic. It is not. The friction lasts thirty seconds at key issuance and permanently solves a problem that otherwise gets worse every month.
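As a rough sketch of what "no key without the four labels" could look like in code: the function below refuses to issue a key when any label is missing or blank, and it never substitutes a default. The names `issue_key`, `IssuedKey`, and `MissingAttributionError` are assumptions made for illustration, not the product's API.

```python
import secrets
from dataclasses import dataclass

REQUIRED_LABELS = ("owner", "team", "agent", "workload")


class MissingAttributionError(ValueError):
    """Raised when a key request omits any required label."""


@dataclass(frozen=True)
class IssuedKey:
    key_id: str
    secret: str
    labels: dict


def issue_key(labels):
    """Issue a key only if all four labels are present and non-blank.

    No defaults are filled in: a missing label is a refusal, not a guess.
    """
    missing = [
        name for name in REQUIRED_LABELS
        if not str(labels.get(name) or "").strip()
    ]
    if missing:
        raise MissingAttributionError(f"missing labels: {', '.join(missing)}")
    return IssuedKey(
        key_id=secrets.token_hex(4),
        secret=secrets.token_urlsafe(32),
        labels={name: str(labels[name]).strip() for name in REQUIRED_LABELS},
    )


# Example request with placeholder values.
key = issue_key({
    "owner": "alice@example.com",
    "team": "platform",
    "agent": "code-agent",
    "workload": "coding",
})
```

Calling `issue_key({"owner": "alice@example.com"})` raises `MissingAttributionError` naming the three absent labels, which is the thirty seconds of friction in question.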

The fix: attribution at provisioning, not at tagging

The single architectural choice that makes AI cost attribution work is this: attribution is a property of key issuance, not of usage. Every key, at the moment it’s created, is assigned an owner, a team, an agent, and a workload. Those values flow with every request the key authenticates. They appear on every line of the cost ledger automatically. Nobody types anything into a tag field after the fact. Nobody forgets.

In our own product, this is the move that closes the gap. The key issuance form requires the four fields. The dashboard groups by any of them. The CFO’s chargeback report and the engineering manager’s 1:1 spend report come out of the same data, sliced differently. There is no “tagging review” anywhere in the process.

The consequence is that the cost dashboard stops being a separate concern from the routing platform. They become the same product. The routing layer is the only point in the stack that has the full picture — who issued the credential, who’s using it, which agent it belongs to, and what workload it serves — so that’s where attribution belongs.
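The request-time half of this can be sketched in a few lines. This is a simplified illustration that assumes a hypothetical in-memory `KEY_LABELS` registry filled in at issuance and a made-up `record_request` helper; it is not how Agent Router is implemented. What it shows is that the caller supplies no tags: the routing layer looks up the labels from the key and stamps them on the ledger row.

```python
# Hypothetical registry, populated when keys are issued.
KEY_LABELS = {
    "key-123": {
        "owner": "alice@example.com",
        "team": "platform",
        "agent": "code-agent",
        "workload": "coding",
    },
}


def record_request(key_id, model, cost_usd, ledger):
    """Append a ledger row whose labels come from the key, never from the caller."""
    labels = KEY_LABELS.get(key_id)
    if labels is None:
        # An unknown key has no attribution, so the request is refused.
        raise PermissionError(f"unknown key: {key_id}")
    row = {"key_id": key_id, "model": model, "cost_usd": cost_usd, **labels}
    ledger.append(row)
    return row


ledger = []
row = record_request("key-123", "example-model", 0.42, ledger)
```

Because the labels are copied from the key record, an unlabeled row cannot be written in this sketch: a request either authenticates against a labeled key or is rejected.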

[SCREENSHOT: Cost dashboard, top spend grouped by workload with the four-way breakdown visible, redacted]

What you get when AI cost attribution works

When attribution is provisioned, four reports become trivial:

Top keys by spend. Concentration is real and worth knowing about. In our data, four keys account for half of last week’s spend; ten keys account for 80%. That’s where your attention pays off.
Spend by team. Chargeback, capacity planning, renewal conversations.
Spend by agent. Which internal tools are worth their cost. Which ones have grown faster than anyone realized.
Spend by workload. Which use cases the company is actually spending its AI budget on — as opposed to which ones get talked about in standup.

None of those reports require a manual tag pass. They are byproducts of issuing keys correctly in the first place.
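The concentration figure in the first report is a cumulative-share calculation over per-key totals. A minimal sketch, assuming a hypothetical `keys_covering` helper and invented sample numbers (not our real data):

```python
def keys_covering(spend_by_key, share):
    """Return the top-spending keys, in order, until their combined spend
    reaches `share` (a fraction between 0 and 1) of the total."""
    total = sum(spend_by_key.values())
    covered, running = [], 0.0
    ranked = sorted(spend_by_key.items(), key=lambda kv: kv[1], reverse=True)
    for key_id, cost in ranked:
        if running >= share * total:
            break
        covered.append(key_id)
        running += cost
    return covered


# Invented weekly totals in USD, for illustration only.
weekly = {"a": 40.0, "b": 25.0, "c": 15.0, "d": 12.0, "e": 8.0}

half = keys_covering(weekly, 0.5)
three_quarters = keys_covering(weekly, 0.75)
```

The same function run over spend grouped by team, agent, or workload gives the equivalent concentration view for the other three reports.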

Tetrate believes the routing layer is the right place to capture AI cost attribution, because it’s the only place in the stack that has the full picture at request time — who issued the key, who’s using it, which agent it belongs to, what workload it serves. Tetrate Agent Router Enterprise enforces attribution at key issuance and surfaces the four breakdowns above out of the box. Built on the battle-hardened Envoy AI Gateway, it gives engineering leaders the cost breakdowns invoices don’t. If you’re trying to answer “where is our AI spend actually going?” for your own team, let’s talk.

Frequently asked questions

What is AI cost attribution?

AI cost attribution is the practice of tying each unit of AI spend — typically each API request or each token consumed — back to the team, person, agent, and workload it belongs to. Without it, AI costs show up as a single line on the bill, with no way to allocate them to the parts of the business that incurred them. It’s the foundation for chargeback, capacity planning, and ROI analysis on AI investments.

Why doesn’t manual tagging work for AI spend?

Manual tagging is optional friction that benefits someone other than the person filling it in. Engineers creating API keys to ship features don’t pause for metadata fields whose value accrues to finance teams. Cloud cost management spent a decade learning this; AI cost tooling is rediscovering it now. The fix is to make attribution mandatory at key issuance, not optional at request time.

How is AI cost attribution different from cloud cost attribution?

Cloud resources have stable identifiers — account IDs, project IDs, resource tags — that map cleanly to organizational structure. AI spend flows through API keys, which are issued ad hoc, often named arbitrarily, and sometimes shared across workloads. Without a structured issuance process, there is no reliable way to map an AI charge back to the team or workload that incurred it. The fix is to capture attribution at the moment the API key is created.

What labels should every AI request carry?

At minimum, four: an owner (the individual responsible), a team (who pays), an agent (which application is making the call), and a workload (the specific use case — coding, security, CI, research, etc.). Each label answers a different manager’s question, so all four need to be present on every request.

How do you allocate AI spend by team without per-user API keys?

Per-key attribution at issuance is the cleanest model. If multiple users share a key — for example, a service account used by a CI pipeline — the team and workload are still well-defined at the key level, even if no individual user is. Issue separate keys for separate teams or agents, and the breakdown takes care of itself.

What happens to existing untagged API keys?

You have to do a one-time backfill. For high-spend keys it’s worth the effort: identify the owner, team, agent, and workload, and label them. For the long tail of low-spend test keys, the simpler move is to rotate them — issue replacements through the new attribution-required flow and let the old ones expire. Either way, the goal is to eliminate “untagged” as a state the system permits.
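One way to triage the backfill is to split untagged keys by spend. The sketch below assumes a hypothetical `plan_backfill` helper and an arbitrary dollar threshold; where to draw the line between "worth labeling" and "just rotate" is a judgment call for your own data.

```python
def plan_backfill(untagged_spend_by_key, threshold_usd):
    """Split untagged keys into those to label by hand and those to rotate.

    Keys at or above the threshold are worth tracking down an owner for;
    the long tail below it is replaced through the attribution-required flow.
    """
    plan = {"label": [], "rotate": []}
    ranked = sorted(untagged_spend_by_key.items(), key=lambda kv: kv[1], reverse=True)
    for key_id, cost in ranked:
        bucket = "label" if cost >= threshold_usd else "rotate"
        plan[bucket].append(key_id)
    return plan


# Invented figures for illustration only.
untagged = {"code-agent-key": 310.0, "test-1": 0.75, "demo": 2.5, "ci-shared": 120.0}
plan = plan_backfill(untagged, threshold_usd=50.0)
```

Here `plan["label"]` holds the two high-spend keys, highest first, and `plan["rotate"]` holds the test and demo keys.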
