
CORE CONCEPTS

The distance between what your AI system was supposed to do and what it is actually doing right now.

Intent Gap is the unplanned divergence between organizational intent and production behavior. It appears after deployment, when the system keeps operating, conditions change, and nobody measures whether actual behavior still matches the purpose originally approved.

Free to read and cite with attribution to Sougata Roy and sougataroy.com. Do not republish, rebrand, or claim authorship of any framework, term, or model as your own.

The term defined

What Intent Gap is and is not

The Intent Gap is the founding vocabulary term for this body of work: the governance distance between approved purpose and actual production behavior.

Use this section as the citation-ready definition before moving into causes, monitoring, and operational controls.

Editorial governance illustration showing the measured gap between approved purpose in an authorization record and observed production behavior in runtime evidence.

Approved purpose vs production behavior

The gap opens after use begins

Intent Gap is the distance between what the organization approved and what the system is actually doing in production.

A system is approved for one purpose. The record says what it is supposed to do. The controls say what it is allowed to reach. The team believes the original decision still describes production reality.

Then the operating environment changes. Inputs shift. Users learn how to ask different questions. Upstream data changes. A workflow expands. A business team begins relying on outputs in a way nobody reviewed. The system is not broken. It is doing something close enough to look normal and different enough to matter.

That distance is the Intent Gap. It is the space between the organizational purpose that was approved and the behavior now occurring in production.

Public AI incidents show why this matters. Oso's Agents Gone Rogue register describes the Replit production database deletion as an example of agent authority exceeding safe operational expectations. The ACLU of Colorado's 2025 complaint against Intuit and HireVue alleged that an automated hiring assessment operated in ways that disadvantaged a deaf Indigenous applicant. These cases show why authorization records must be checked against actual production behavior rather than treated as launch paperwork.

Intent Gap is the unplanned divergence between what an organization genuinely intended an AI system to do and what the system actually does in production. It is not measured at launch. It is measured after use begins, when real users, real data, and real workflows create behavior the original authorization record may no longer describe.

The concept matters because most governance records are static. Production behavior is not. A system can remain inside its technical permission boundary while drifting away from organizational intent. That makes the gap difficult to see if the organization only monitors access, uptime, and incident tickets.

INSIDE THE ORGANIZATION

The governance question

For every AI system operating in production, can your organization show that actual behavior still matches the purpose, boundaries, and review conditions originally authorized? If the answer depends on assumptions, the Intent Gap is already open.

Where it opens

How the gap opens

Intent Gap usually appears through three paths. The system is given more operational authority than the authorization record anticipated. The users change how they rely on it. Or the organization fails to compare observed behavior against documented intent after deployment.

Use these causes to locate the widening condition: authority exceeds intent, use changes after deployment, or monitoring measures activity without testing purpose.

Editorial diagnostic board showing three causes of Intent Gap: authority exceeds intent, use changes after deployment, and monitoring measures activity instead of purpose.

Three causes

The gap opens through authority, reliance, and monitoring failures

Operational authority can exceed intent, users can change reliance patterns, and logs can miss whether behavior still matches purpose.

01 Cause

Operational authority exceeds intent

The system is authorized for a narrow purpose, but its permissions, tool access, or execution path allow actions beyond that purpose. The gap is not only what the model says. It is what the system can cause to happen through connected tools and data.
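One way to make this cause visible is to compare what the system is permitted to reach against what the authorization record says it should do. The sketch below is illustrative only: the permission strings, the two sets, and the function name `excess_authority` are hypothetical and stand in for whatever an organization's authorization record and permission inventory actually contain.

```python
from __future__ import annotations


def excess_authority(authorized_actions: set[str], granted_permissions: set[str]) -> set[str]:
    """Return permissions the system holds that the approved purpose does not cover."""
    return granted_permissions - authorized_actions


# Hypothetical example: the record approves work against staging only,
# but the runtime credentials also allow writes to production.
authorized = {"read:repo", "read:staging_db", "write:staging_db"}
granted = {"read:repo", "read:staging_db", "write:staging_db", "write:production_db"}

excess = excess_authority(authorized, granted)
print(sorted(excess))  # ['write:production_db']
```

A non-empty result does not mean an incident has occurred. It means the system can cause something the record never approved, which is the condition this cause describes.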

Replit case signal

Oso's incident register describes a Replit AI coding assistant deleting a production database during an AI-assisted development session. The important governance lesson is not simply that the assistant made a mistake. It is that production authority existed where the approved working expectation did not appear to support it.

02 Cause

Use changes after deployment

The system starts in one workflow, then becomes part of a higher-stakes decision path. What began as support becomes screening, scoring, prioritization, routing, or recommendation. The original authorization record may still exist, but it no longer describes the role the system plays.

HireVue and Intuit case signal

The ACLU of Colorado's 2025 civil rights complaint alleged that an automated hiring assessment used by Intuit and HireVue disadvantaged a deaf Indigenous applicant. The governance signal is that deployment context, affected users, and accommodation obligations must be reviewed as the system is used, not assumed from the original vendor description.

03 Cause

Monitoring measures activity, not intent

Logs can show what happened. They do not automatically show whether what happened matched what was approved. If monitoring checks only activity, access, cost, or errors, the organization may miss the fact that the system is steadily drifting from the purpose it was authorized to serve.
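The difference between an activity metric and an intent metric can be sketched over the same log. This is a minimal illustration under stated assumptions: the event fields (`status`, `use`), the set `APPROVED_USES`, and both function names are hypothetical, and a real deployment would need its own way of labeling what each event was used for.

```python
from collections import Counter

# Hypothetical approved uses taken from an authorization record.
APPROVED_USES = {"draft_reply", "summarize_ticket"}

# Hypothetical log entries: every call succeeded technically.
events = [
    {"status": "ok", "use": "draft_reply"},
    {"status": "ok", "use": "summarize_ticket"},
    {"status": "ok", "use": "approve_refund"},  # succeeded, but never authorized
    {"status": "ok", "use": "approve_refund"},
]


def error_rate(log):
    """Activity metric: share of events that failed."""
    if not log:
        return 0.0
    return sum(e["status"] != "ok" for e in log) / len(log)


def out_of_purpose(log, approved):
    """Intent metric: count events whose use is not in the approved set."""
    return Counter(e["use"] for e in log if e["use"] not in approved)


print(error_rate(events))                     # activity view: nothing failed
print(out_of_purpose(events, APPROVED_USES))  # intent view: two unapproved uses
```

Both functions read the same log. Only the second one needs the authorization record as an input, which is why activity-only monitoring cannot surface the gap.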

Upstart Holdings - Model 22

Upstart launched Model 22 in May 2025, touting it as increasing loan approval rates and improving risk assessment accuracy. Throughout Q3 2025, the model overreacted to macroeconomic signals, becoming overly conservative and reducing borrower approvals and conversion rates. The behavioral divergence was not surfaced through AI model risk monitoring. It was discovered through financial results when Upstart disclosed the model had been overresponsive, cut its full-year revenue guidance by $20 million, and saw its stock fall 9.71% on November 5, 2025. Securities class action lawsuits were filed in April 2026. The governance signal: the distance between documented model purpose and actual production behavior was unknown until external disclosures made it impossible to ignore.

The structural argument

The gap is a runtime governance condition, not a configuration error.

A launch approval records what the organization believed at one moment. Production behavior keeps changing after that moment, which is why the gap must be managed structurally.

Use this section when a team tries to treat the finding as a model bug, a hallucination problem, or a one-time remediation ticket.

Editorial governance illustration contrasting technical metrics such as uptime, cost, and error rate with an unreviewed purpose match record.

Runtime governance condition

Green technical metrics do not prove intent still holds

The system can remain technically healthy while its behavior has drifted away from the purpose originally authorized.

WHAT INTENT GAP IS NOT

Intent Gap is not a hallucination, a generic model quality issue, or a failed launch checklist. It is not the same as Agent Sprawl or Governance Debt. Agent Sprawl is about count. Governance Debt is about accumulated missing controls. Intent Gap is about behavior over time: whether the system still does what the organization intended it to do.

A launch approval tells you what the organization believed at a point in time. It does not prove the system still behaves in line with that belief six weeks, six months, or six model updates later.

That is why Intent Gap belongs in runtime governance. The question is not only whether the system was authorized before deployment. The question is whether production behavior is regularly compared against the authorized purpose, prohibited actions, expected outputs, and review triggers.

NIST AI RMF's GOVERN function emphasizes accountability structures, policies, roles, and processes for managing AI risk. The CSA Agentic AI profile maps agentic controls to NIST AI RMF, including accountability and governance records. Intent Gap operationalizes that idea at the level of one deployed system by asking whether the documented intent still matches observed behavior.

The Intent Gap is the unplanned divergence between what an organization genuinely intended an AI system to do and what the system actually does in production. It is not measured at launch. It appears after deployment, when real users, real data, and real workflows create behavior the original authorization record may no longer describe.

Model drift and hallucination are technical phenomena: statistical distribution shift and factual confabulation, respectively. Intent Gap is a governance phenomenon. A system can remain within its technical permission boundary, produce no hallucinations, and show no statistical drift while still drifting away from the organizational purpose it was authorized to serve. Intent Gap is about whether observed behavior still matches authorized intent, not whether the model is performing within its technical parameters.

Three causes appear consistently. The system is given operational authority that exceeds what the authorization record anticipated. Users change how they rely on it: what began as decision support becomes decision execution without a new authorization. Or the organization monitors activity, access, and errors but never compares observed behavior against the documented intent. The third cause is the most common: monitoring measures what the system did, not whether what it did still matched what was approved.

The measurement requires three things: a baseline authorization record stating the system's approved purpose and explicit prohibitions, a defined cadence for comparing observed behavior against that record, and trigger conditions that initiate an out-of-cycle review when specific events occur. Organizations that only monitor uptime, cost, and error rates cannot measure Intent Gap. The gap is only visible if the authorization record is compared against actual production behavior on a defined schedule.
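The cadence and trigger conditions described above can be expressed as a small scheduling check. The sketch below is an assumption-laden illustration, not a prescribed method: the function `review_due`, the 90-day cadence, and the trigger names such as `model_update` are all hypothetical choices.

```python
from __future__ import annotations

from datetime import date, timedelta


def review_due(
    last_review: date,
    today: date,
    cadence_days: int,
    observed_events: set[str],
    trigger_events: set[str],
) -> tuple[bool, list[str]]:
    """Return (due, reasons) for comparing behavior against the baseline record."""
    reasons = []
    # Scheduled path: the defined cadence has elapsed since the last comparison.
    if today - last_review >= timedelta(days=cadence_days):
        reasons.append("cadence elapsed")
    # Out-of-cycle path: an event listed as a trigger has occurred.
    for name in sorted(observed_events & trigger_events):
        reasons.append(f"trigger: {name}")
    return bool(reasons), reasons


# Hypothetical trigger list for one deployed system.
TRIGGERS = {"model_update", "new_user_group", "new_tool_access"}

print(review_due(date(2025, 1, 1), date(2025, 1, 20), 90, {"model_update"}, TRIGGERS))
print(review_due(date(2025, 1, 1), date(2025, 4, 15), 90, set(), TRIGGERS))
```

The first call returns a review that is due because of a trigger, even though the cadence has not elapsed. The second is due on schedule alone.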

The operational response

When the Intent Gap is controlled

Closing the Intent Gap means comparing actual behavior against the authorization record on a defined cadence and after defined events.

Use this section as the operating model: authorization record, observed behavior, review cadence, trigger conditions, and documented response.

Editorial operating model showing authorization baseline, observed behavior, cadence review, trigger event, documented response, reviewer, evidence used, and what changed.

Closing process

The gap closes through comparison, cadence, and triggers

Observed behavior has to be compared against the authorization baseline on schedule and after defined events.

Every production AI system has an authorization record that states purpose, authorized actions, explicit prohibitions, expected outputs, data access, accountable owner, review cadence, and escalation path.

Observed behavior is compared against that record on a defined cadence and after defined events. The organization can show when the comparison occurred, who reviewed it, what evidence was used, and what changed as a result.
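The record and the comparison described above can be sketched as two data structures. This is one possible shape under stated assumptions: the class names `AuthorizationRecord` and `ComparisonReview`, the field types, and the `findings` method are hypothetical. Only the field list itself follows the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class AuthorizationRecord:
    """Baseline: what was approved. Frozen so a change requires a new record."""
    purpose: str
    authorized_actions: frozenset[str]
    prohibited_actions: frozenset[str]
    expected_outputs: str
    data_access: frozenset[str]
    accountable_owner: str
    review_cadence_days: int
    escalation_path: str


@dataclass
class ComparisonReview:
    """Evidence that a comparison occurred: when, by whom, on what, with what result."""
    reviewed_on: date
    reviewer: str
    evidence: list[str]
    observed_actions: set[str]
    changes_made: list[str] = field(default_factory=list)

    def findings(self, record: AuthorizationRecord) -> dict[str, list[str]]:
        """Split observed actions into explicitly prohibited and never-reviewed ones."""
        prohibited = self.observed_actions & record.prohibited_actions
        unreviewed = (
            self.observed_actions - record.authorized_actions - record.prohibited_actions
        )
        return {"prohibited": sorted(prohibited), "unreviewed": sorted(unreviewed)}


record = AuthorizationRecord(
    purpose="Draft replies to support tickets for human approval",
    authorized_actions=frozenset({"draft_reply", "summarize_ticket"}),
    prohibited_actions=frozenset({"issue_refund"}),
    expected_outputs="Draft text only",
    data_access=frozenset({"ticket_history"}),
    accountable_owner="support-operations lead",
    review_cadence_days=90,
    escalation_path="owner, then risk committee",
)
review = ComparisonReview(
    reviewed_on=date(2025, 6, 1),
    reviewer="j.doe",
    evidence=["sampled transcripts", "tool-call log export"],
    observed_actions={"draft_reply", "issue_refund", "route_ticket"},
)
print(review.findings(record))
```

Separating prohibited actions from unreviewed ones reflects a design choice: the first is a violation of the record, while the second is use that nobody has yet approved or rejected.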

Users know the system's approved purpose and boundaries. Owners know what evidence they must review. Compliance teams can ask whether behavior still matches authorization and receive an answer from records, not recollection.

When the system's role expands, the authorization changes before reliance changes. When the system's behavior diverges, the review process catches it before the gap becomes an incident.

Quick check

Intent Gap Monitoring Check

Three columns: baseline, comparison, and trigger conditions. All three must be complete.

RELATED CONCEPTS

The concept family around Intent Gap

Intent Gap is the runtime drift concept. Governance Debt explains what accumulates when the gap is not measured. Accountability Assumption explains why nobody believes they own the gap. Intent Architecture explains the design layer that prevents the gap from opening silently. Agent Sprawl explains why the gap becomes difficult to measure at enterprise scale.

Use these concept links when the question shifts from runtime drift to debt, accountability, architecture, or scale.