Capability Tokens

An HMAC-bound credential that scopes an AI agent to a specific set of tools, a specific tenant, and a specific time window. Capability tokens are the foundational control of agent runtime authorization: they replace coarse-grained service-identity authentication (which only proves the agent is who it claims to be) with fine-grained per-tool authorization (which proves the agent may attempt this specific tool call). They are the first check the gateway runs on every tool invocation, and they are the substrate the remaining five controls operate against.

Why capability tokens matter

When an AI agent runs in production, it holds credentials (typically a Microsoft Entra ID Managed Identity, an AWS IAM role, or a GCP service account) that grant it access to internal tools. Those credentials are necessarily broad: the same agent might need to read from the CRM, write to the ticketing system, query a database, and call an external webhook. Identity layers like Entra ID can authenticate that the agent is the agent, but they cannot decide which of those tools the agent may attempt on a given call. The result is excessive agency by default: the risk class that the OWASP Top 10 for LLM Applications (2025) lists as LLM06, Excessive Agency.

Capability tokens close that gap. The gateway mints a token at session start that explicitly enumerates the tools the agent may invoke, the tenant whose data it may touch, and the time window within which the token is valid. Every subsequent tool call carries the token; the gateway verifies it before any other check runs. Calls that present an invalid, expired, or out-of-scope token are refused at the first hop with -32010; the gateway never forwards them to the underlying tool.
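The first-hop decision described above can be sketched as follows. This is a minimal illustration, assuming the token's signature has already been verified and its claims decoded. The Claims fields and the check_claims function are hypothetical names chosen for the example, not IntentGate's API; the failure-mode strings are the ones listed under "Error code and observability".

```python
from dataclasses import dataclass
from typing import Optional

CAPABILITY_ERROR = -32010  # JSON-RPC code returned for capability token failures


@dataclass(frozen=True)
class Claims:
    """Hypothetical decoded token claims (field names are assumptions)."""
    tools: frozenset
    tenant: str
    issued_at: int   # Unix seconds
    expires_at: int  # Unix seconds


def check_claims(claims: Optional[Claims], tool: str, tenant: str, now: int) -> Optional[str]:
    """Return a failure mode, or None if the call may proceed to the later checks."""
    if claims is None:
        return "token-missing"
    if now < claims.issued_at:
        return "token-invalid"   # presented before its own issuance time
    if now >= claims.expires_at:
        return "token-expired"
    if tenant != claims.tenant:
        return "tenant-mismatch"
    if tool not in claims.tools:
        return "scope-mismatch"
    return None


claims = Claims(frozenset({"crm.read", "tickets.write"}), "tenant-a", 1000, 1900)
print(check_claims(claims, "crm.read", "tenant-a", now=1500))  # None
print(check_claims(claims, "db.query", "tenant-a", now=1500))  # scope-mismatch
```

Passing the clock in as now keeps the check deterministic and easy to test; a gateway would supply the current time at the call site.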

How IntentGate implements capability tokens

IntentGate's capability tokens are bound by a Hash-based Message Authentication Code (HMAC) using a per-tenant key derived from the gateway's master key. Each token's payload includes the agent identity, the allowed tool catalogue, the tenant scope, the issuance timestamp, the expiry timestamp, and a unique token identifier (JTI, named after the JWT ID claim) for revocation tracking. The HMAC binding ensures that the token cannot be forged or modified: any tampering changes the payload and invalidates the HMAC. Because the tool catalogue and tenant scope sit inside the signed payload, a valid token also cannot be reused against a different tool or tenant.
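A minimal sketch of this kind of HMAC binding, using only the Python standard library. The wire format (base64 payload, a dot, base64 tag), the field names, and the key derivation (HMAC-SHA256 of the tenant name under the master key) are all assumptions made for illustration; the source does not specify them, and a production system might use HKDF for the derivation instead.

```python
import base64
import hashlib
import hmac
import json
import uuid


def tenant_key(master_key: bytes, tenant: str) -> bytes:
    # Assumed derivation: one HMAC-SHA256 call keyed by the master key.
    return hmac.new(master_key, tenant.encode(), hashlib.sha256).digest()


def mint(master_key: bytes, agent: str, tools: list, tenant: str, iat: int, exp: int) -> str:
    """Build the payload and append an HMAC tag computed with the tenant's key."""
    payload = {"agent": agent, "tools": sorted(tools), "tenant": tenant,
               "iat": iat, "exp": exp, "jti": uuid.uuid4().hex}
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    tag = hmac.new(tenant_key(master_key, tenant), body, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(body) + b"." + base64.urlsafe_b64encode(tag)).decode()


def verify(master_key: bytes, token: str, tenant: str):
    """Return the payload dict if the tag matches for this tenant, else None."""
    try:
        body_b64, tag_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or malformed base64
        return None
    expected = hmac.new(tenant_key(master_key, tenant), body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    return json.loads(body)
```

In this sketch, a token minted for one tenant fails verification under another tenant's key, and any edit to the payload (for example, appending a tool) makes the recomputed tag differ from the presented one.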

Token issuance happens through a minting endpoint that runs the same six-check pipeline as a regular tool call, but in capability-issuance mode: the caller's identity is verified, the requested scope is checked against the caller's permissions, and the resulting token is signed and returned. The console-pro mint-license tool provides operator access to this flow; the gateway's API exposes it to identity bridges so capability tokens can be minted automatically from upstream identity events.
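The scope check at issuance time can be illustrated with a small sketch. The function name, its parameters, and the choice to reject an over-broad request outright (rather than silently trimming it) are assumptions for the example; the source only states that the requested scope is checked against the caller's permissions.

```python
def grantable_scope(requested_tools: set, requested_tenant: str,
                    caller_tools: set, caller_tenants: set) -> set:
    """Return the tool set to place in the token, or raise PermissionError.

    Hypothetical helper: the caller's identity is assumed to be verified
    already, and caller_tools / caller_tenants describe what it may delegate.
    """
    if requested_tenant not in caller_tenants:
        raise PermissionError(f"tenant not permitted for caller: {requested_tenant}")
    excess = requested_tools - caller_tools
    if excess:
        raise PermissionError(f"tools not permitted for caller: {sorted(excess)}")
    return set(requested_tools)


granted = grantable_scope({"crm.read"}, "tenant-a",
                          caller_tools={"crm.read", "tickets.write"},
                          caller_tenants={"tenant-a"})
print(granted)  # {'crm.read'}
```

Rejecting rather than trimming keeps the minted scope identical to what the caller asked for, which arguably makes audit records easier to interpret.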

Revocation is handled by a deny-list of JTIs the gateway checks on every token presentation. Adding a JTI to the deny-list immediately invalidates any subsequent presentation of that token, regardless of expiry. The master key itself can be rotated as the ultimate revocation: rotation invalidates every previously-minted token in a single operation, which is the intended emergency-containment behavior for incident response.
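A JTI deny-list of this kind might look like the sketch below. The class and method names are hypothetical, and the in-memory dictionary stands in for whatever shared, durable store a real gateway would use. Recording each token's expiry next to its JTI is an added design assumption: it lets the list be pruned once a token would be rejected for expiry anyway, provided expiry is always checked alongside revocation.

```python
class JtiDenyList:
    """In-memory sketch of a revocation list keyed by token identifier."""

    def __init__(self) -> None:
        self._revoked: dict[str, int] = {}  # jti -> token expiry (Unix seconds)

    def revoke(self, jti: str, expires_at: int) -> None:
        """Block every later presentation of the token with this JTI."""
        self._revoked[jti] = expires_at

    def is_revoked(self, jti: str) -> bool:
        """Checked on every token presentation."""
        return jti in self._revoked

    def prune(self, now: int) -> int:
        """Drop entries whose tokens have already expired; return how many were removed."""
        stale = [jti for jti, exp in self._revoked.items() if exp <= now]
        for jti in stale:
            del self._revoked[jti]
        return len(stale)
```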

Capability attenuation across sub-agents

Multi-agent architectures — where a parent agent delegates work to one or more sub-agents — require a mechanism to limit the blast radius of any individual sub-agent. Capability attenuation is that mechanism: a parent agent can derive a more restrictive token from its own, signing the derivation with its capability key, and pass the attenuated token to a sub-agent. The sub-agent's attenuated token can only ever be a subset of the parent's: fewer tools, narrower tenant, shorter window. The gateway verifies the attenuation chain on every call.
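One well-known way to build such a chain is the macaroon-style construction sketched below, in which the parent's HMAC tag serves as the key that signs the child's narrower payload. Whether IntentGate uses this exact construction is not stated in the source, so treat it as an assumption; the function names and payload fields are likewise hypothetical. For simplicity, "narrower tenant" is modelled as "same tenant".

```python
import hashlib
import hmac
import json


def _tag(key: bytes, payload: dict) -> bytes:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, body, hashlib.sha256).digest()


def is_subset(child: dict, parent: dict) -> bool:
    """A child scope must not exceed its parent on any axis."""
    return (set(child["tools"]) <= set(parent["tools"])
            and child["tenant"] == parent["tenant"]
            and child["exp"] <= parent["exp"])


def mint_root(tenant_key: bytes, payload: dict) -> dict:
    return {"chain": [payload], "tag": _tag(tenant_key, payload)}


def attenuate(token: dict, narrower: dict) -> dict:
    """Parent side: derive a child token, keyed by the parent's own tag."""
    if not is_subset(narrower, token["chain"][-1]):
        raise ValueError("attenuated scope exceeds parent scope")
    return {"chain": token["chain"] + [narrower], "tag": _tag(token["tag"], narrower)}


def verify_chain(tenant_key: bytes, token: dict):
    """Gateway side: recompute every tag from the root key and check each step narrows."""
    chain = token["chain"]
    if not chain:
        return None
    key = tenant_key
    for i, payload in enumerate(chain):
        if i > 0 and not is_subset(payload, chain[i - 1]):
            return None
        key = _tag(key, payload)
    return chain[-1] if hmac.compare_digest(key, token["tag"]) else None
```

Because each tag is a one-way function of the previous one, a sub-agent holding only the child tag cannot recover the parent's tag, so in this sketch it has no way to re-sign a wider scope. The chain of payloads also gives the gateway the full lineage to record.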

Attenuation defeats two distinct failure modes. First, the sub-agent cannot accidentally call a tool the parent didn't intend it to call — the attenuated token's scope makes that structurally impossible. Second, if the sub-agent is compromised (through prompt injection or other means), the blast radius is bounded by the attenuation rather than by the parent's full capability. The audit chain records the full attenuation lineage so investigators can trace any sub-agent action back to the originating parent and originating user.

Error code and observability

Capability token failures return JSON-RPC error code -32010. The error payload includes the specific failure mode (token-missing, token-invalid, token-expired, scope-mismatch, tenant-mismatch, JTI-revoked) and is logged to the per-tenant hash-chained audit log. Standard SIEM adapters (Microsoft Sentinel, Splunk, Elastic, Chronicle) route on the error code; operators can build dashboards for token failures by scope, by tenant, or by failure mode without parsing the payload.
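For orientation, a refusal could be serialized as a JSON-RPC 2.0 error response along these lines. The code, message, and data members are the standard JSON-RPC 2.0 error-object structure; the message text and the failure_mode key inside data are assumptions, since the source names the failure modes but not the payload's field names.

```python
import json

CAPABILITY_ERROR = -32010
FAILURE_MODES = {"token-missing", "token-invalid", "token-expired",
                 "scope-mismatch", "tenant-mismatch", "JTI-revoked"}


def capability_error(request_id, failure_mode: str) -> str:
    """Serialize a JSON-RPC 2.0 error response for a refused tool call."""
    if failure_mode not in FAILURE_MODES:
        raise ValueError(f"unknown failure mode: {failure_mode}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": CAPABILITY_ERROR,
            "message": "capability token rejected",   # assumed wording
            "data": {"failure_mode": failure_mode},   # assumed field name
        },
    })


print(capability_error(7, "token-expired"))
```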

Related controls

Capability tokens are the substrate for the other five controls — intent enforcement, policy, budget tracking, memory provenance, and bidirectional PII filtering — each of which runs only after the token has been verified. A call with no valid capability token never reaches any of those checks. See the Agent Runtime Authorization category page for the full picture, or the Glossary for definitions of every term used here.

Frequently asked questions

How is a capability token different from a JWT or OAuth token?

In this architecture, the JWT or OAuth token issued by the identity provider authenticates the caller: it proves the agent is who it claims to be. (OAuth access tokens can carry scopes, but these are typically coarse, per-API grants rather than per-tool, per-tenant, per-session limits.) Capability tokens authorize the action: they declare which specific tools the agent is allowed to attempt and under what tenant scope. The identity token answers "who"; the capability token answers "may attempt what." Both can coexist in the same request: the agent presents an Entra ID JWT to prove identity and a capability token to prove authorization. IntentGate verifies both.

What is capability attenuation?

Capability attenuation is the process of deriving a more restrictive token from a more permissive one, typically when a parent agent delegates work to a sub-agent. The attenuated token can only be used for a subset of the parent's tools, for a shorter time window, and for a narrower tenant scope. Attenuation is how multi-agent architectures contain blast radius across sub-agent delegation.

What happens when the IntentGate master key is rotated?

Master key rotation invalidates every previously-minted capability token by design. Active agent sessions must re-mint tokens against the new key. This is the intended emergency-containment behavior for incident response or planned tear-downs; it should not be done casually because every in-flight session is interrupted. For routine rotation, the gateway exposes a planned-rotation workflow in which operators issue new tokens against a new key while the old key remains in use in parallel for a defined transition window.
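The difference between an emergency rotation and a planned one can be sketched as a verifier that accepts tags from more than one key. This assumes that running the old key "in parallel" means the gateway keeps accepting old-key tags until the window closes; the source does not spell out the mechanism, and the function and parameter names here are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def tag_is_valid(body: bytes, tag: bytes, new_key: bytes,
                 old_key: Optional[bytes], old_key_valid_until: int, now: int) -> bool:
    """Accept a tag made with the new key, or with the old key inside the window.

    For an emergency rotation, pass old_key=None so that every token signed
    with the previous key is rejected immediately.
    """
    keys = [new_key]
    if old_key is not None and now < old_key_valid_until:
        keys.append(old_key)
    return any(
        hmac.compare_digest(tag, hmac.new(k, body, hashlib.sha256).digest())
        for k in keys
    )
```

With a transition window, sessions holding old-key tokens keep working until the cutoff and can re-mint in the meantime; with old_key=None, they fail at the next call.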