Bet 05 · Agent runtime governance

Your AI Agents Have No Badge, No Boss, and No Audit Trail

Your customers are about to show up at your platform with their AI agents. Your IAM was built for humans. Your fraud models were trained on humans. Now what?

A friend of a friend runs consumer financial services security at a top-twenty bank. We sat next to each other at dinner last month, both of us mid-conversation with other people, when he turned and asked the question that has been keeping him up.

“We know our customers are going to start showing up with AI agents. Not in two years. Now. They want their assistant to check balances, move money, dispute charges. We want to let them. The question we cannot answer is how.”

He listed the open questions. How do we authenticate the agent itself, separate from the customer? How do we verify that the agent is actually authorized to act on behalf of this customer? What is the agent allowed to do, and who decides: us, or the customer, or some combination? If the agent moves five thousand dollars to the wrong account, who is responsible: the customer who delegated, the agent vendor, or us for letting it happen? And what’s the standard we hold the agent to? Same as a human? Higher? Different category entirely?

He had pages of questions like these. He wasn’t asking how to solve them. He was asking who is even supposed to be answering them.

The new actor

Your identity stack governs humans. You spent ten, maybe twenty years getting it right. On the customer side: KYC at onboarding, MFA on login, device binding, behavioral biometrics, step-up auth on risky actions, fraud detection, dispute and chargeback workflows. On the workforce side: SSO, role-based access, periodic reviews, joiners-movers-leavers. Different stacks, different teams, same underlying assumption. One human per account, and the account is the actor.

Your network stack governs devices. Posture, segmentation, ZTNA. Your data stack governs records. Classification, encryption, and retention so the lawyers will sign off.

You now have an entire fourth category, and quite possibly nothing in your stack governs it.

AI agents are not customers, employees, devices, or records. They are not the human you onboarded, risk-scored, or fingerprinted across devices. They run on a customer’s phone, in a vendor’s infrastructure, sometimes in a cloud you will never see. And the ones you have not built a policy for are already running inside your customers’ lives, getting ready to show up at your platform asking to act on their behalf.

The closest analogue in your existing program is a service account. But a service account does one thing forever and belongs to one team. An agent decides. The same agent, with the same prompt and the same tools, can do different things at different times because the model behind it is non-deterministic and the input is contextual. And the same agent, acting for one customer today and a different customer tomorrow, is not a single actor at all. It is a class of actors authorized by principals who are not in the room at the moment of action.

You cannot govern this with a policy you wrote for a customer, an employee, a device, or a service account.

What goes wrong

Walk through the question my dinner friend was actually asking. A customer wants to give their AI agent the ability to check their bank balance. Low-risk action. Most banks could accept that today by having the customer copy a token into the agent’s settings. OK, that works once.

Now the agent wants to move money to other accounts the customer owns. Different consent threshold, different verification cadence, different auditability. Does the customer confirm every transfer, defeating most of the value, or pre-authorize a set of behaviors, accepting a different risk? Who is liable if the agent moves money to an account the customer did intend to fund, but at a time they would not have approved if asked?

Now the agent wants to dispute a charge. The bank’s dispute process assumes a human filing. The fraud team’s models are trained on human patterns. The agent’s pattern is different by design. Is it fraud, a legitimate delegated action, or an attack from a compromised agent? The bank’s telemetry can’t tell. The fallback is “block and ask the human to confirm.” Block at scale and the agent stops being useful. Allow at scale and the bank stops being safe.

This is one institution and one product line. Multiply by every consumer financial product, every healthcare portal, every government service, every retail account. The number of institutions trying to answer the same questions in parallel right now is somewhere north of a thousand, and each one is inventing its answer locally. No single institution is the right level at which to solve this. The answer has to be a standard that crosses every institution.

One customer, many agents

The framing so far has been one customer, one agent. That is not the world that’s coming. The world that’s coming has the customer running an accounting agent that reads transactions and categorizes them, a payments agent that pays bills and moves money between known accounts, an investment agent that rebalances within a risk profile, and a tax agent that pulls statements once a year. Different agents from different vendors, each scoped to a different function, all acting on the same customer’s account in parallel. The customer expects every one of them to work.

Your CIAM stack does not have a slot for “this agent of this customer can do this subset of what the customer can do.” The closest primitive is an OAuth-style scoped token, which in practice is issued to applications you have registered and vetted in advance, not to a customer-curated bench of third-party agents. There is no canonical place to express that the accounting agent reads but cannot move money, that the payments agent has a daily cap, that the investment agent’s authority lapses if the risk profile changes.
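To make the missing slot concrete, here is a minimal sketch of what a per-agent grant record might look like. Everything in it is hypothetical: `AgentGrant`, `is_allowed`, the action names, and the cap values are illustrative assumptions, not a feature of any existing CIAM product or of OAuth itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentGrant:
    """Hypothetical record: what one agent may do for one customer."""
    agent_id: str
    customer_id: str
    actions: frozenset                          # e.g. {"read_transactions", "transfer"}
    daily_cap_cents: Optional[int] = None       # None means no cap is set
    risk_profile_version: Optional[int] = None  # None means not tied to a risk profile


def is_allowed(grant, action, amount_cents=0, spent_today_cents=0,
               current_risk_profile_version=None):
    """Return True only if the requested action fits inside the grant."""
    # Default deny: the action must be explicitly listed.
    if action not in grant.actions:
        return False
    # Authority tied to a risk profile lapses when the profile changes.
    if (grant.risk_profile_version is not None
            and grant.risk_profile_version != current_risk_profile_version):
        return False
    # Enforce the daily cap on money movement, if one is set.
    if amount_cents > 0 and grant.daily_cap_cents is not None:
        if spent_today_cents + amount_cents > grant.daily_cap_cents:
            return False
    return True


# The bench of agents from the paragraph above, with assumed values.
accounting = AgentGrant("accounting-agent", "cust-1",
                        frozenset({"read_transactions"}))
payments = AgentGrant("payments-agent", "cust-1",
                      frozenset({"read_transactions", "transfer"}),
                      daily_cap_cents=50_000)
investment = AgentGrant("investment-agent", "cust-1",
                        frozenset({"rebalance"}),
                        risk_profile_version=3)
```

The point of the sketch is the shape, not the fields: the grant is keyed on the pair of agent and customer, and each constraint the customer cares about has a place to live.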

It gets harder. A customer puts assets in a revocable trust. The trustee is a different human. Can the trustee’s agent act on the trust’s accounts? What about joint accounts where both holders have agents and the agents disagree on whether to authorize a transfer? Powers of attorney. Guardianships. Custodial accounts for minors. Your data model already encodes some of these structures for human actors. The agent layer has to express the same structures or it collapses distinctions your existing law and policy depend on.

And then the question my dinner friend kept circling: can an agent delegate to another agent? A customer’s personal-CFO agent decides at runtime to call a tax-prep sub-agent and hand it scoped credentials to read transactions. Whose credential reaches your edge? Whose audit trail is canonical? Do you see the CFO agent, the tax sub-agent, or both? Today the honest answer is that the agent vendors are deciding on the fly, and the institution often does not see that the sub-delegation happened.

What looked like a pair is a graph. One principal at the top. Many directly-delegated agents. Some of those agents delegating further. Each node an actor. Each edge a delegation. Your edge sees only the outermost node holding credentials. The audit trail breaks at the first sub-delegation you cannot see.
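A rough sketch of that graph, under stated assumptions: each delegation records who granted it and links back to its parent, a sub-delegation can only narrow scope, and the full chain can be recovered from whichever node shows up at your edge. The names `Delegation`, `sub_delegate`, and `audit_chain` are hypothetical, and no agent vendor is known to expose delegation this way today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    """Hypothetical edge in the delegation graph: `delegator` authorizes `actor`."""
    actor: str
    delegator: str
    scopes: frozenset
    parent: Optional["Delegation"] = None  # None when the delegator is the human principal


def sub_delegate(parent, actor, scopes):
    """Create a sub-delegation, refusing any scope the parent does not hold."""
    scopes = frozenset(scopes)
    if not scopes <= parent.scopes:
        raise PermissionError(
            f"{actor} requested scopes beyond the grant held by {parent.actor}"
        )
    return Delegation(actor=actor, delegator=parent.actor, scopes=scopes, parent=parent)


def audit_chain(delegation):
    """Return every actor from the human principal down to the presenting node."""
    chain = [delegation.actor]
    node = delegation
    while node.parent is not None:
        node = node.parent
        chain.append(node.actor)
    chain.append(node.delegator)  # the human principal at the root
    chain.reverse()
    return chain


# The personal-CFO example: the customer delegates to the CFO agent,
# which hands a narrower scope to a tax-prep sub-agent.
cfo = Delegation("cfo-agent", "customer-1",
                 frozenset({"read_transactions", "pay_bills"}))
tax = sub_delegate(cfo, "tax-subagent", {"read_transactions"})
print(audit_chain(tax))  # ['customer-1', 'cfo-agent', 'tax-subagent']
```

If the credential that reaches your edge carried something like this chain, the audit trail would not break at the first sub-delegation. Whether it should is exactly the kind of default a standard has to settle.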

These are the defaults to lock down before the agent vendors lock them down for you through your silence. The ones you accept passively here will be very hard to take back.

The gap is structural, not technical

There are vendors selling AI agent runtime governance right now. Watchlight AI is one. There will be a dozen more by the end of the year. They are real products and they will solve real problems inside the enterprise.

None of them solves the cross-institutional problem at the edge, where your customer’s agent meets your platform. That problem requires a standard. The bank cannot ship a proprietary “how to authenticate the AI agent acting for our customer” API and expect every agent vendor to integrate with it. The agent vendor cannot ship a proprietary “how I represent the human I’m acting for” claim and expect every institution to accept it. Both are happening anyway, in pockets. The trap is two-sided. Ship first and you build against a standard that does not exist yet, then rebuild when it lands. Wait and your customers go unserved while their agents work everywhere else. Either way, someone else writes the standard you end up living with.

This is the discipline gap, moved outside your walls. Every fix the institution knows how to make was built for the human-delegated version. The tools to extend them exist. The owner accountable for which extension gets made first does not. Until your institution has one named human responsible for what happens when a customer’s agent meets your platform, the tooling will keep arriving ahead of a process that hasn’t caught up.

What you can actually do about it

Three moves, in this order.

Name a single owner. Not “the agent steering committee.” A single named human accountable for what your institution’s answer is to “how does an AI agent act on behalf of our customer.” Without that, every team will solve the part they own and the resulting policy will not be coherent.

Decide your defaults before vendors decide for you. Most institutions will end up accepting a class of agent behavior by default because they didn’t write a policy that excluded it. Write the policy first. If you don’t know what you want to allow, lead with a small, low-risk set, expand deliberately, and require explicit consent at each expansion. The institutions that decide later will be the institutions that get the defaults the largest agent vendors wanted.
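As a minimal sketch of “write the policy first,” assume a default-deny allowlist that starts with one low-risk action and grows only through recorded, explicit approvals. The class `AgentPolicy`, the action names, and the `approved_by` field are hypothetical; who counts as the approver (the customer, the named owner, or both) is a decision the sketch leaves open.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical default-deny policy for agent actions."""
    allowed: set = field(default_factory=lambda: {"check_balance"})
    consent_log: list = field(default_factory=list)

    def permits(self, action):
        # Anything not explicitly listed is denied.
        return action in self.allowed

    def expand(self, action, approved_by):
        # Each expansion needs a named approver and leaves a record.
        if not approved_by:
            raise ValueError("expansion requires explicit consent from a named approver")
        self.allowed.add(action)
        self.consent_log.append((action, approved_by))


policy = AgentPolicy()
print(policy.permits("check_balance"))  # True
print(policy.permits("transfer"))       # False

policy.expand("transfer", approved_by="customer-1")
print(policy.permits("transfer"))       # True
```

The design choice worth noticing is that nothing becomes allowed by omission. A new class of agent behavior gets in only because someone wrote it down.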

Engage the standards conversation. This is the one a CISO usually doesn’t think is their job. It is. The authentication primitive your customers’ agents present at your edge is going to be a standard. You can be in the room when it’s written, or you can be on the receiving end of someone else’s answer. The institutions that don’t show up to the FIDO Alliance and OpenID Foundation conversations on agent identity are going to spend the next decade integrating with a primitive someone else designed.

The tooling helps. The tooling does not substitute.

Back to the question

The framing my friend kept returning to: we can welcome these agents. We can try to deny them. What we can’t do is nothing. They are coming whether we have a policy or not, and the version where they show up and we have nothing waiting is the version that ends badly. Better to be ready.

He was not asking for technology. He was asking for the standard, the policy, and the owner. He was asking for the category.

The number of customer agents arriving at your platform is going to grow by some multiple this year. The number of slots in your stack for what those agents are, and what they are allowed to do, is going to grow by zero, unless you build them.

If you’re working through this and want a thought partner on what the category should look like, Agent Control Efficacy is where I’m starting: measuring whether your controls hold against the agents already in your environment.

About this piece

Published June 2026 by Steve Curtis, a cybersecurity executive and operator. This article is part of the steve.curt.is newsletter on security integration, founder strategy, and the operator judgment calls behind running cybersecurity businesses at scale.

Topic: Bet 05 · Agent runtime governance.

About the author

Steve Curtis
Cybersecurity executive with 20+ years across consulting (PwC, Accenture), vendor leadership (Palo Alto Networks), venture-backed operator roles (Cygnvs, Pangea / CrowdStrike, Staris AI), and independent advisory through Rencana. Former Global Managing Director of Accenture Security (1,800-person org, ~100X growth) and former SVP of Ecosystems for Prisma & Cortex at Palo Alto Networks.

Selected operator results (case studies)

Case study · Pangea → CrowdStrike (2024–2025)

Joined Pangea as Head of Business Development to lead the pivot to AI detection and response. Built the channel motion and partner ecosystem that positioned the company for acquisition. Eleven months later, CrowdStrike acquired Pangea for $260M as the basis of its AIDR offering.

Result: $260M strategic exit; product line became a named CrowdStrike offering.

Case study · Accenture Security (2013–2021)

As Global Managing Director, led the cybersecurity services P&L across Communications, Media, Technology, and Aerospace sectors. Scaled the business approximately 100X over eight years through delivery modernization, automation, and acquisition integration.

Result: 100X revenue growth; 1,800-person global organization; multi-hundred-million-dollar services portfolio.

Further reading