AI Accountability: Who Owns Every AI Decision | Babak Abbaschian

Adapted from the second part of my sessions at Dreamin’ Data, the Salesforce Data Cloud community conference.

In the first part of this series I argued that enterprise AI does not fail because the technology is weak. It fails when leadership does not hand the organization a concrete operating model for absorbing AI into real work: the workflows that change, the decision rights that move, the metrics that count, the governance that holds, and the old processes that get retired. That was the subject of Part I: strategy.

This article moves one layer closer to the decision itself. Once AI is inside the operating model, accountability has to become specific. Leadership owns the operating model. Business owners own outcomes. Technical teams own system quality. Risk and compliance own the guardrails. And every consequential AI-supported decision needs a named human who can stand behind it, override it, reconstruct it, and stop the system when necessary.

THE SHARPER QUESTION

Not whether a human is in the loop. Who owns which decision, at which stage.

It is tempting to say AI fails because nobody is accountable for the decision the model made. That is memorable, but it is not quite right, because not every system makes a decision. Some recommend, rank, score, draft, route, approve, or trigger an action that a human or another system then executes. The more rigorous statement is this:

AI fails when consequential decisions are influenced, recommended, or executed by a model without a named human accountable for the outcome.

And accountability is not one vague person somewhere in the workflow. It is a chain of named ownership that runs across strategic intent, workflow outcome, data readiness, model quality, risk boundaries, the moment of decision, runtime control, and the response after something goes wrong. Wherever a link in that chain is unnamed is exactly where the failure lands. The four cases below are not really stories about bad models. They are stories about missing links.

CASE 01 · AIR CANADA

The chatbot that became a defendant

In November 2022, Jake Moffatt was grieving. His grandmother had just died and he needed to fly across Canada for the funeral. He asked Air Canada’s website chatbot about bereavement fares and was told he could pay full fare now and apply for the discount within ninety days of the ticket being issued. So he did. He bought the ticket, flew, and filed his claim. Air Canada denied it. The real policy, on a different page of the same website, said the opposite: the bereavement rate could not be claimed after travel was completed. The chatbot was wrong.

Moffatt sued. In front of the British Columbia Civil Resolution Tribunal, Air Canada argued that the chatbot was a separate legal entity responsible for its own actions. The tribunal member, Christopher Rivers, called it a remarkable submission, ruled that the chatbot is part of Air Canada’s website, and held the airline liable. The damages were small, around eight hundred Canadian dollars. The lesson was not. The missing link was at the moment of decision: no human inside the company owned what the bot told a customer, so when it failed, the company tried to argue the machine was its own legal person. You cannot outsource accountability to a chatbot.

CASE 02 · ZILLOW

The half billion dollar algorithm

Zillow Offers used the Zestimate algorithm to value homes, make instant cash offers, and resell at a markup. In early 2021, an internal initiative called Project Ketchup made three changes. The Zestimate became the cash offer, with no more human review of price. The pricing experts who had calibrated the model for years were explicitly told to stop questioning its valuations. And to win sellers, Zillow sometimes raised the algorithm’s price by thousands of dollars.

Zillow bought roughly 7,000 homes, paying above market in places the algorithm believed were still rising. In the third quarter of 2021 it took a 304 million dollar writedown, with total losses past 500 million dollars, laid off about 2,000 people, and watched its market value fall from 48 billion dollars in February to 16 billion dollars in November. The missing link was the runtime control layer. The humans who could have caught it were ordered to stop overriding the model. Remove that link from a high stakes loop and you do not get more efficiency. You get faster failure.

CASE 03 · THE NETHERLANDS

When an algorithm brings down a government

Between 2005 and 2019, the Dutch tax authority used an automated system to flag fraud in childcare benefit claims, using inputs like dual nationality and foreign sounding names as risk indicators. Roughly 26,000 families, disproportionately immigrant and low income, were falsely accused and ordered to repay tens of thousands of euros, often in full and without payment plans. Parents lost homes, jobs, and in some cases custody of their children. In January 2021, the entire Dutch government resigned over the scandal.

The black box system resulted in a black hole of accountability, with the Dutch tax authorities trusting an algorithm to help in decision making without proper oversight.

Merel Koning, Amnesty International, 2021

Two links were missing at once. There was no owner who could explain a decision, because the civil servants doing manual review did not know why the algorithm had scored a family high risk, and there was no one watching for drift as the system ran for more than a decade. The cost was not a writedown or a lawsuit. It was a government collapse and 26,000 demolished lives, from a system nobody owned.

CASE 04 · 2024, AND WE ARE NOT LEARNING

The pattern repeats

In October 2023, New York City launched the MyCity chatbot, an AI assistant to help small business owners navigate regulations. Within five months, in March 2024, journalists at The Markup found it telling landlords they could discriminate against Section 8 voucher holders, illegal in the city since 2008, telling employers they could take workers’ tips, and telling businesses they could lock out tenants. When confronted, the city added a disclaimer but kept the bot online, and when asked whether it could be used for professional advice, the bot still said yes.

In June 2024, McDonald’s ended a three year partnership with IBM on drive through ordering after viral videos showed customers charged for nine sweet teas they did not order, or bacon added to ice cream. In both cases the missing link was the same: nobody owned the runtime decision of whether the output was correct and legal enough to act on. The bot acted, the world executed, and the chain broke at the moment of decision. Same pattern in 2024 as a decade earlier. Smart people, real models, no clear owner.

THE PATTERN

Three failure modes show up in every case

Across these four cases and dozens of others, three recurring failure modes appear. I call them the three D’s, and they map directly onto broken links in the chain.

• Defer. A human is in the workflow but is structurally prevented from overriding the machine. Zillow’s pricing experts. The Dutch reviewers signing off on scores they could not explain.

• Drift. The model is deployed once, certified once, then never re-evaluated as the world changes. Zillow’s algorithm met a once in a generation housing boom and nobody owned the monitoring of drift.

• Disown. When the system fails, the organization argues it is somebody else’s problem. Air Canada and the separate legal entity. The Dutch authority and sovereign immunity.

Defer. Drift. Disown. Those three words will explain a large share of the public AI failures we read about over the next five years.
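Of the three, Drift is the easiest to instrument. The following is a minimal sketch, assuming a pricing model whose predictions can later be compared with realized values. The class name, the owner field, the window size, and the 1.5x trigger are all hypothetical choices for illustration, not details from any of the cases above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DriftMonitor:
    """Tracks recent prediction error against the error measured at sign-off."""

    owner: str                 # named human accountable for drift (hypothetical field)
    baseline_error: float      # mean absolute percentage error at certification
    tolerance: float = 1.5     # assumed trigger: recent error > baseline * tolerance
    window: int = 50           # number of most recent outcomes to average
    errors: deque = field(default_factory=deque)

    def record(self, predicted: float, actual: float) -> None:
        """Store the absolute percentage error of one realized outcome (actual != 0)."""
        self.errors.append(abs(predicted - actual) / abs(actual))
        if len(self.errors) > self.window:
            self.errors.popleft()

    def recent_error(self) -> float:
        """Mean error over the current window, or 0.0 if nothing is recorded yet."""
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def has_drifted(self) -> bool:
        """True once a full window of outcomes shows error above the agreed trigger."""
        full = len(self.errors) >= self.window
        return full and self.recent_error() > self.baseline_error * self.tolerance


# Illustrative numbers only: three sales where the model priced too high.
monitor = DriftMonitor(owner="J. Rivera, Head of Pricing", baseline_error=0.02, window=3)
for predicted, actual in [(310_000, 300_000), (330_000, 300_000), (345_000, 300_000)]:
    monitor.record(predicted, actual)
if monitor.has_drifted():
    print(f"Escalate to {monitor.owner}: recent error {monitor.recent_error():.1%}")
```

The point of the sketch is the owner field rather than the arithmetic: the alert has to land on a named person who has the authority to act on it.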

THE ACCOUNTABILITY MAP

Who owns what, at which layer and stage

Here is the refinement that the failures demand. A CEO cannot own every model output. A data scientist cannot own every business consequence. Legal cannot own every customer interaction. And a frontline operator cannot be accountable if they have no authority to override. So accountability has to be mapped to layers and stages, each with a named owner and a defined scope. This is the map I work from.

Layer	Accountable owner	What they own
Strategic intent	CEO or executive sponsor	Why the AI system exists and what outcome justifies it
Workflow outcome	Business or P&L owner	The business result, customer impact, adoption, and operational consequence
Data readiness	Data owner or steward	Data quality, access, lineage, freshness, and definitions
Model quality	AI, ML, or engineering owner	Evaluation, monitoring, drift, testing, and technical performance
Risk boundaries	Legal, compliance, security, risk	Prohibited actions, regulatory exposure, auditability, privacy, and security
Moment of decision	Named decision owner or authorized operator	Accepting, rejecting, escalating, or overriding the AI-supported decision
Runtime control	Human on the loop	Real time intervention, handoff, correction, and stopping the system
Post-incident response	Business, technical, and risk owners together	Remediation, root cause analysis, regulator response, and workflow change

Read the four failures against this map and the gaps are obvious. Air Canada had no owner at the moment of decision. Zillow severed runtime control by telling the experts to stop. The Dutch system had no owner who could explain a decision and no one watching for drift. The map is not bureaucracy. It is the difference between a system you can answer for and one you cannot.
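One way to make the map checkable is to keep it as data next to each system and refuse to call the chain intact while any layer lacks a named person. This is a rough sketch under that assumption. The layer names come from the table above, while the `Owner` type, the `missing_links` function, and the example names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The eight layers of the accountability map, in table order.
LAYERS = [
    "Strategic intent",
    "Workflow outcome",
    "Data readiness",
    "Model quality",
    "Risk boundaries",
    "Moment of decision",
    "Runtime control",
    "Post-incident response",
]


@dataclass(frozen=True)
class Owner:
    name: str   # a person, not a team
    title: str


def missing_links(assignments: Dict[str, Optional[Owner]]) -> List[str]:
    """Return every layer that has no named human owner, in map order."""
    missing = []
    for layer in LAYERS:
        owner = assignments.get(layer)
        if owner is None or not owner.name.strip():
            missing.append(layer)
    return missing


# Hypothetical system: every layer is filled except two.
assignments: Dict[str, Optional[Owner]] = {
    layer: Owner("Example Person", "Example Title") for layer in LAYERS
}
assignments["Moment of decision"] = None                 # nobody assigned
assignments["Runtime control"] = Owner("", "Operations")  # a team, but no name

print(missing_links(assignments))  # ['Moment of decision', 'Runtime control']
```

A title without a name counts as missing on purpose: "Operations" cannot be asked to explain a decision, a person can.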

THE OPERATING FRAMEWORK

The accountability stack: four controls on every system

The map says who owns each layer. The stack says what has to be physically present around the system for those owners to do their job. Four controls, on every consequential AI system.

1. Decision rights. For every consequential decision the system influences, you can name the human who owns the outcome. Not the team. The human.

2. Override paths. The accountable human can override the model in real time, not three weeks later in a postmortem.

3. Decision logs. Every consequential decision is logged so you can reconstruct what the model said, what the human did, and why.

4. Kill switches. For every system, documented conditions under which it gets turned off, signed in advance by the accountable owner.

No new technology is required. It requires a leader to say: before this thing goes live, I want a name on every decision it touches, a way to override it in real time, a record of what it did, and a signed condition for turning it off.
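To show how little machinery the four controls need, here is a minimal sketch that wraps a model output in all of them. It is an illustration under stated assumptions, not a production design: `GuardedSystem`, `DecisionRecord`, and the override-rate kill condition (three overrides in the last five decisions) are hypothetical, and a real kill condition would be whatever the accountable owner signed in advance.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass(frozen=True)
class DecisionRecord:
    """Control 3: enough detail to reconstruct what the model said and what the human did."""
    timestamp: str
    model_output: str
    final_action: str
    decided_by: str
    overridden: bool
    reason: str


def too_many_overrides(log: List[DecisionRecord], window: int = 5, limit: int = 3) -> bool:
    """Hypothetical pre-agreed kill condition: `limit` overrides in the last `window` decisions."""
    return sum(r.overridden for r in log[-window:]) >= limit


@dataclass
class GuardedSystem:
    name: str
    decision_owner: str                                      # control 1: a named human
    kill_condition: Callable[[List[DecisionRecord]], bool]   # control 4: agreed in advance
    log: List[DecisionRecord] = field(default_factory=list)
    stopped: bool = False

    def decide(self, model_output: str, override: Optional[str] = None, reason: str = "") -> str:
        if self.stopped:
            raise RuntimeError(f"{self.name} is stopped; escalate to {self.decision_owner}")
        # Control 2: the owner's override, when given, wins over the model in real time.
        final_action = model_output if override is None else override
        self.log.append(DecisionRecord(
            timestamp=datetime.now(timezone.utc).isoformat(),
            model_output=model_output,
            final_action=final_action,
            decided_by=self.decision_owner,
            overridden=override is not None,
            reason=reason,
        ))
        # Evaluate the signed kill condition after every logged decision.
        if self.kill_condition(self.log):
            self.stopped = True
        return final_action


pricing = GuardedSystem("home-pricing", "J. Rivera, Head of Pricing", too_many_overrides)
pricing.decide("offer 300000")
for _ in range(3):
    pricing.decide("offer 300000", override="offer 280000", reason="recent comps are lower")
print(pricing.stopped, len(pricing.log))
```

In this sketch frequent overrides are treated as a signal to stop the system, which is the opposite of telling the experts to stop questioning the model.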

A CORRECTION

Human in the loop is not enough

There is a comforting phrase in every AI keynote, including some I have given: human in the loop. It suggests that as long as a person sits somewhere in the workflow, we are safe. We are not. The Dutch civil servants were in the loop, doing manual reviews, handed a black box that flagged cases with no explanation, so they signed off. That is what people do under pressure, with a target on their back, inside a system they do not understand. Zillow’s pricing experts were in the loop too. They were told to stop questioning it.

Human on the loop is the answer. A person with authority over the workflow, documented decision rights that survive sprint planning, real time visibility into model behavior, incentives aligned with overriding when needed, and a signed kill switch with pre-agreed triggers. In the loop versus on the loop is the difference between AI that creates value and AI that creates lawsuits.

IN PRACTICE

What it looks like when the chain is intact

At Churchill Downs, where I led data strategy and analytics from 2019 to 2024, every model shipped with three attachments before it went into production: a decision owner with name and title, a reversibility plan written before launch, and a kill switch with conditions signed in advance. On that footing we built a Customer 360 platform across 32 subsidiaries and more than 80 million guests that contributed to measurable business impact across the enterprise. Our models were not better than Zillow’s. Our accountability was clearer.

At Voxr AI, where voice agents make real calls to consumers in regulated insurance and financial-services-adjacent markets, the stakes are legal. A wrong quote, a misleading promise, or an unsupported claim can become a consumer protection, insurance compliance, telemarketing, or unfair and deceptive practices issue depending on the product and jurisdiction. So every script change has a named human owner who signs off before it goes live, every call is observable and interruptible in real time, and every consequential outcome is logged so we can reconstruct what was said and who deployed that version. The model matters, but the scaffolding is the difference between a tool that performs and a system the company can answer for.

The same discipline can be designed into a product’s spine instead of bolted on after a lawsuit. In a sales coaching companion we are building, the representative owns the call and the AI only advises, every suggestion carries its reasoning, and every coaching moment is reconstructible. In the agentic assistant I run for myself, the governing principle is simple: authority should be earned by track record, bounded by risk, reversible in real time, and logged as evidence. An agent earns the right to act without asking only after a long, measured record of getting it right, and even then the action passes through a notification I can stop. The architecture is the accountability.
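The governing principle for the agentic assistant can be written down as a small gate. The sketch below is one possible reading of "earned by track record, bounded by risk, reversible in real time, and logged as evidence". The thresholds, the 1 to 5 risk scale, and every name in it are assumptions for illustration, not the actual implementation of that assistant.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Mode(Enum):
    ASK_FIRST = "ask first"
    ACT_WITH_NOTICE = "act, with a notification the human can stop"


@dataclass(frozen=True)
class Policy:
    min_trials: int = 200            # assumed length of a "long, measured record"
    min_success_rate: float = 0.99   # assumed bar for "getting it right"
    max_risk: int = 2                # assumed scale: 1 (trivial) to 5 (severe)


@dataclass
class TrackRecord:
    trials: int = 0
    successes: int = 0

    def success_rate(self) -> float:
        return self.successes / self.trials if self.trials else 0.0


def authorize(action: str, risk: int, record: TrackRecord,
              policy: Policy, evidence: List[dict]) -> Mode:
    """Decide whether the agent may act without asking, and log why."""
    # Earned by track record: enough trials and a high enough success rate.
    earned = (record.trials >= policy.min_trials
              and record.success_rate() >= policy.min_success_rate)
    # Bounded by risk: high-risk actions always go back to the human.
    bounded = risk <= policy.max_risk
    mode = Mode.ACT_WITH_NOTICE if earned and bounded else Mode.ASK_FIRST
    # Logged as evidence: every authorization decision is recorded.
    evidence.append({"action": action, "risk": risk, "earned": earned,
                     "bounded": bounded, "mode": mode.value})
    return mode


evidence: List[dict] = []
policy = Policy()
seasoned = TrackRecord(trials=200, successes=199)
novice = TrackRecord(trials=10, successes=10)

print(authorize("archive newsletter", 1, seasoned, policy, evidence))
print(authorize("send payment", 4, seasoned, policy, evidence))
print(authorize("archive newsletter", 1, novice, policy, evidence))
```

Even the most permissive mode here is act with notice, so reversibility is built into the top level of authority rather than added as an exception.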

WHY THIS MATTERS NOW

As models start to act, the chain matters more, not less

Everything above gets sharper as AI shifts from recommending to acting. Gartner forecasts that by 2028, at least 15 percent of day-to-day work decisions will be made autonomously by agentic AI, up from zero in 2024, and it also predicts that more than 40 percent of agentic AI projects will be canceled by the end of 2027 because of escalating costs, unclear value, and inadequate risk controls. Read those two forecasts together. Autonomy is arriving fast, and most early attempts at it will fail for exactly the reasons described here. When a model only recommends, a human still stands between the recommendation and the world. When an agent acts, that buffer is gone. Decision rights become action rights, override paths have to operate at machine speed, and decision logs become the only way to reconstruct what happened. The accountability map and the accountability stack are not optional overhead in that world. They are the precondition for deploying at all.

The macro picture is why this is urgent. RAND’s 2024 report cites estimates that more than 80 percent of AI projects fail, roughly twice the failure rate of traditional IT projects. A Gartner survey of 782 infrastructure and operations leaders, conducted in late 2025 and published in April 2026, found only 28 percent of AI use cases fully succeed and meet ROI, with one in five failing outright, and Gartner’s 2024 forecast warned that at least 30 percent of generative AI projects would be abandoned after the proof of concept stage by the end of 2025. The bottleneck is not raw model capability. It is the missing owner. The common thread across these numbers is not that organizations lack ambition. It is that ambition is moving faster than ownership, controls, and operational accountability.

FOR MONDAY MORNING

Five questions for your team

Leave with five questions. If the answers come back fuzzy, you have an accountability gap.

1. For each AI system in production, can I name the single human who owns the decision it influences?

2. Does that human have the technical ability to override the system in real time?

3. If we were subpoenaed tomorrow about a specific decision, could we reconstruct what the model saw, what it said, and who deployed it?

4. Do we have a written, signed kill switch, with agreed conditions, for every consequential model?

5. Are the accountable humans incentivized to override the model when it is wrong, or to defer to it because that is faster and the metrics reward speed?

If you cannot answer those five clearly, you do not have an AI strategy. You have an AI exposure.

THE CLOSE

From data, to decisions you can stand behind

Most of the AI industry is obsessed with the first half of the phrase: better data, bigger models, lower latency, smarter retrieval. All of it matters. None of it is the bottleneck. The bottleneck is the second half. Decisions that humans can trust, act on, stand behind, and answer for.

The Dutch government did not collapse because the algorithm was wrong. Algorithms are wrong some percentage of the time. It collapsed because no link in the chain was named. Nobody owned the decision to use the system, nobody could explain it, and nobody could stop it.

Trust is not a feature. It is a property of the accountability chain you build around the model.

Part I asked leaders to deliver a concrete operating model. Part II asks them to put a name on every decision inside it. Build the map. Wire the stack. Name the owners at every layer and stage. Then deploy.

Babak Abbaschian is CTO and Cofounder of Voxr AI and Board Member and AI Strategy Chair at Noble Cortex. He previously led data strategy and analytics at Churchill Downs Incorporated. Reach him at babak@abbaschian.com or on LinkedIn.

Sources and references

Cases accessed May 12, 2026. Industry statistics re-verified May 22, 2026.

Case 1: Air Canada chatbot ruling (Moffatt v. Air Canada, 2024 BCCRT 149)

• British Columbia Civil Resolution Tribunal decision summary, American Bar Association. https://www.americanbar.org/groups/business_law/resources/business-law-today/2024-february/bc-tribunal-confirms-companies-remain-liable-information-provided-ai-chatbot/ (accessed 2026-05-12).

• CBC News reporting on the case. https://www.cbc.ca/news/canada/british-columbia/air-canada-chatbot-lawsuit-1.7116416 (accessed 2026-05-12).

• McCarthy Tetrault legal analysis of negligent misrepresentation. https://www.mccarthy.ca/en/insights/blogs/techlex/moffatt-v-air-canada-misrepresentation-ai-chatbot (accessed 2026-05-12).

Case 2: Zillow Offers shutdown

• Zillow Group Q3 2021 SEC Form 8-K filing (official writedown disclosure). https://www.sec.gov/Archives/edgar/data/0001617640/000161764021000085/q32021991.htm (accessed 2026-05-12).

• Stanford Graduate School of Business case analysis. https://www.gsb.stanford.edu/insights/flip-flop-why-zillows-algorithmic-home-buying-venture-imploded (accessed 2026-05-12).

• Journal of Information Systems Education on Project Ketchup and pricing experts. https://jise.org/Volume35/n1/JISE2024v35n1pp67-72.pdf (accessed 2026-05-12).

Case 3: Dutch childcare benefits scandal (toeslagenaffaire)

• Amnesty International report, Xenophobic Machines, with the Merel Koning quote. https://www.amnesty.org/en/latest/news/2021/10/xenophobic-machines-dutch-child-benefit-scandal/ (accessed 2026-05-12).

• Wikipedia summary with primary source links and the 26,000 families figure. https://en.wikipedia.org/wiki/Dutch_childcare_benefits_scandal (accessed 2026-05-12).

• Lighthouse Reports investigation, The Algorithm Addiction. https://www.lighthousereports.com/investigation/the-algorithm-addiction/ (accessed 2026-05-12).

Case 4: NYC MyCity chatbot and McDonald’s / IBM drive through

• The Markup investigation, NYC AI Chatbot Tells Businesses to Break the Law. https://themarkup.org/artificial-intelligence/2024/03/29/nycs-ai-chatbot-tells-businesses-to-break-the-law (accessed 2026-05-12).

• CNBC reporting on the end of the McDonald’s and IBM partnership. https://www.cnbc.com/2024/06/17/mcdonalds-to-end-ibm-ai-drive-thru-test.html (accessed 2026-05-12).

Industry statistics on AI project failure and adoption

• RAND Corporation, Why AI Projects Fail (by some estimates, more than 80 percent fail). https://www.rand.org/pubs/research_reports/RRA2680-1.html (accessed 2026-05-22).

• Gartner, AI projects in I&O stall ahead of meaningful ROI (28 percent fully succeed, 782 leaders, surveyed Nov to Dec 2025). https://www.gartner.com/en/newsroom/press-releases/2026-04-07-gartner-says-artificial-intelligence-projects-in-infrastructure-and-operations-stall-ahead-of-meaningful-roi-returns (accessed 2026-05-22).

• Gartner, at least 30 percent of generative AI projects abandoned after proof of concept by end of 2025 (2024 forecast). https://www.gartner.com/en/newsroom/press-releases/2024-07-29-gartner-predicts-30-percent-of-generative-ai-projects-will-be-abandoned-after-proof-of-concept-by-end-of-2025 (accessed 2026-05-22).

• Gartner, over 40 percent of agentic AI projects will be canceled by end of 2027, with at least 15 percent of day-to-day work decisions made autonomously by agentic AI by 2028. https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027 (accessed 2026-05-22).

From Data to Decisions: Accountability at Every Layer and Stage