
NEWSLETTER

You Approved One Agent. It Has Quietly Become Another.

June 9, 2026

Reaching Stage 3 of the Authorization Coverage Lifecycle proves an authorization was written. It does not prove the authorization still describes the agent running in production. Continuous tuning creates a new artifact gap: an intent statement with no behavioral-reconciliation date is a record that was true once.

The internal audit lead had the registry open on one screen and the board deck on the other. Stage 3. Coverage operating. Ninety-one percent of agents in the tenant carried a signed intent statement, a named owner, a review date inside ninety days. She had spent eight months getting that number above the line, and on Thursday she was going to report it. The agent she was looking at, the one that drafted renewal terms and routed them for counter-signature, had a clean record. Purpose stated. Scope bounded. Owner confirmed in writing. She closed the laptop. The record was complete. What the record described had changed four times since it was signed, and nothing in the registry had a field for that.

The question this edition answers is narrow and uncomfortable. When an agent revises its own behavior after the authorization was signed, what exactly is your signed authorization a record of?

At Build 2026, in early June, Microsoft introduced a new category of agent. On the Microsoft 365 Blog, the company calls them Autopilots, and the defining line is plain: these are always-on agents that work autonomously, with their own identity, and act on your behalf. The same post is precise about what changes. Autopilots stay active in the background, understand how work gets done across your apps and systems, and take action without needing to be prompted each time. Microsoft Scout is the first of them. Set that beside a second announcement from the same week. Frontier Tuning, described on the Microsoft 365 Developer Blog, applies reinforcement learning inside your compliance boundary, and the operative sentence is this: the system improves continuously as it learns from each interaction. Two capabilities, announced together at Build. One moves the work off the prompt and into an always-on loop. The other lets the agent's behavior be retuned continuously, inside your compliance boundary, after you have authorized it.

Hold those next to a signed intent statement. The intent statement is a photograph. It records what the agent was, on the day a named human read its scope and signed. An agent under continuous tuning is not a photograph. It is a moving thing that was photographed once. The registry holds the photograph and reports it as current, because the registry was built to check whether a record exists, not whether the record still resembles the agent.

The authorization is a dated snapshot. The agent it describes kept moving. By the time anyone reads the record, it captures a version that no longer exists.

This is the part that the coverage ratio cannot see. You can reconcile every agent in your tenant, assign every owner, date every review inside ninety days, and reach Stage 3 honestly. The ratio measures whether the authorization was written. It does not measure whether the authorization still describes the thing running in production. An agent that retunes itself toward what works moves away from the signed scope without a single human action that anyone could point to later and call the decision.
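To make the blind spot concrete, here is a minimal sketch of how a coverage ratio of this kind might be computed. The record shape, the field names, and the `is_covered` rule are assumptions for illustration only; they are drawn from the three checks the narrative mentions (signed intent statement, named owner, review date inside ninety days), not from any real registry product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class RegistryRecord:
    # Hypothetical registry row; field names are illustrative.
    agent_id: str
    has_signed_intent: bool
    owner: Optional[str]
    last_review: Optional[date]


def is_covered(rec: RegistryRecord, today: date, max_review_age_days: int = 90) -> bool:
    """True when the record exists and is complete: signed, owned, recently reviewed."""
    if not rec.has_signed_intent or not rec.owner or rec.last_review is None:
        return False
    return (today - rec.last_review) <= timedelta(days=max_review_age_days)


def coverage_ratio(records: List[RegistryRecord], today: date) -> float:
    """Share of agents whose record passes the completeness checks."""
    if not records:
        return 0.0
    return sum(is_covered(r, today) for r in records) / len(records)
```

Nothing in `is_covered` reads what the agent is doing now. Every input is a property of the record, so an agent that has been retuned since signing scores exactly the same as one that has not.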

Here is what someone who has sat through enough examinations knows that the press coverage does not mention. The dangerous gap is not the agent without a record. Everyone is hunting those; that is what Stage 1 and Stage 2 are for. The dangerous gap is the agent whose perfect record no longer describes it, because that agent passes every check you have built. It shows green. It survives the internal audit. It clears the board report.

Every row checked, ninety-one percent green, and the one thing that went wrong has no row to check. The coverage ratio measures whether the authorization was written, not whether it still matches the agent.

And the day a regulator asks not "was this agent authorized" but "was the version that took this specific action the version you authorized," the signed statement on file answers a question that is no longer being asked.

The signed authorization answers confidently. It just answers a question the examiner is no longer asking: not whether the agent was authorized, but whether the version that acted was the one you authorized.

The outside expectation is already written, and it does not bend to a tuning loop. The EU AI Act, Article 14(4), requires that a high-risk AI system reach the deployer in a form that lets the people assigned to oversee it decide, in any particular situation, not to use the system or to otherwise disregard, override or reverse its output, and interrupt it through a stop mechanism that brings it to a halt in a safe state. That obligation assumes a deployer who knows what the system currently does. The May 1, 2026 Five Eyes guidance on agentic AI is blunter still, warning that agentic AI systems can take actions without explicit human approval, increasing the risk of insecure actions occurring without human oversight. A self-tuning, always-on agent is squarely within the risk profile both documents describe, and a signed intent statement from last quarter does not satisfy either one.

The Authorization Coverage Lifecycle places organizations by the ratio of governed agents to total agents, and most who believe they are in Resolution are reading a number that has quietly stopped meaning what they think. Reaching Stage 3 was supposed to prove authorization was current. With agents that revise themselves, currency needs its own test, separate from existence. The fix is not another framework. It is one field the photograph never needed and the moving agent cannot do without: the date and version the authorization was last reconciled against the agent's actual behavior, not the date the statement was signed. An intent statement with no behavioral-reconciliation date is the new artifact gap. It is a record that was true once. The full lifecycle, including how coverage decays even when the ratio holds, is here.
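As a rough illustration of the field described above, the sketch below adds a reconciliation date and version to an authorization record and tests currency separately from existence. Everything named here is hypothetical: the `AuthorizationRecord` shape, the `currency_status` function, the status labels, and the idea that a continuously tuned agent exposes a comparable version identifier are all assumptions. The 90-day default simply borrows the review window from the narrative.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AuthorizationRecord:
    # Hypothetical record; only signed_on exists in a "photograph" registry.
    agent_id: str
    signed_on: date
    reconciled_on: Optional[date] = None       # last check against actual behavior
    reconciled_version: Optional[str] = None   # agent version that check was run against


def currency_status(
    rec: AuthorizationRecord,
    deployed_version: str,
    today: date,
    max_age_days: int = 90,
) -> str:
    """Classify whether the authorization still describes the deployed agent."""
    if rec.reconciled_on is None or rec.reconciled_version is None:
        # Signed but never compared with behavior: a record that was true once.
        return "never_reconciled"
    if rec.reconciled_version != deployed_version:
        # The agent has changed since the last reconciliation.
        return "version_drift"
    if today - rec.reconciled_on > timedelta(days=max_age_days):
        # Same version, but the behavioral check itself is old.
        return "stale"
    return "current"
```

A record can be complete on every existing registry check and still return `never_reconciled` here, which is the gap in question. The signing date plays no part in the result.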

Microsoft built the always-on agent and the loop that lets it change. Both arrived the same week, to the same applause. Neither came with the field that tells you whether the agent you signed off on is still the agent in the room.

In your environment, the agents reporting green on the coverage ratio: when was any of them last checked against what it actually does now, rather than what its signed record says it was authorized to do then?