Rational Partners
Image: Plant by Alan Warburton, a fragmented view of a house plant as a visual metaphor for AI and biological data. Licensed under CC BY 4.0.

AI Assurance Is Not AI Audit: What Boards and Investors Actually Need

Roja Buck

The AI assurance industry is being built at speed. The Big Four are racing to claim the space. PwC has announced "Assurance for AI" aligned to IAASB standards. Deloitte has its Trustworthy AI framework. KPMG became the first Big Four firm to achieve ISO 42001 certification for AI management systems. EY has invested over a billion dollars in assurance technology. NIST has published its AI Risk Management Framework, and the FTC is expanding its focus on AI systems that affect consumers.

All of this is necessary. None of it answers the question that actually keeps boards and investors awake at night: will this AI transformation deliver?

That is not a compliance question. It is a delivery question. And the emerging AI assurance industry, as currently configured, has almost nothing to say about it.

88% of organizations now use AI, but only 6% generate meaningful EBIT impact. The compliance audit passes. The transformation still fails.

What AI Assurance Actually Means

Frameworks for AI assurance are proliferating. NIST's AI Risk Management Framework describes a structured approach to identifying, assessing, and managing AI risk across four core functions: Govern, Map, Measure, and Manage. The scope of what these frameworks call "AI assurance" is a portfolio of techniques: bias audits, impact assessments, conformity assessments, performance testing, and formal verification. It is fundamentally about whether your AI system is safe, fair, transparent, and governed. These are important properties. An AI system making decisions about credit, hiring, or medical treatment absolutely must be evaluated for fairness and accountability.

But notice what this definition does not include. It says nothing about whether the AI program will deliver business value. Nothing about whether the data foundations can support the use cases in the business plan. Nothing about whether the team has the capability to build and maintain what has been promised. Nothing about whether the transformation will generate a return on investment within any reasonable timeframe.

The frameworks define AI assurance as being about the trustworthiness of AI systems. That is one dimension of a much larger problem.

What the Big Four Are Selling

The major accounting and consulting firms have moved quickly to claim the AI assurance market. Their offerings are sophisticated and, within their scope, genuinely valuable. But it is worth understanding exactly what they cover, because the gaps matter.

PwC has launched Assurance for AI, designed to provide ISAE 3000-aligned assurance over AI governance frameworks and controls. This is audit-grade assurance over whether an organization's AI governance meets defined standards. It answers the question: are the right controls in place?

Deloitte frames AI assurance through its Trustworthy AI framework, built around seven dimensions: transparent, explainable, fair, robust, privacy-preserving, safe, and responsible. This is a comprehensive governance lens, evaluating whether AI systems meet ethical and operational standards across multiple axes.

KPMG has invested in model validation, control testing, and formal certification. In December 2025, KPMG International became the first Big Four firm to achieve ISO 42001 certification for AI management systems, and they have expanded their AI assurance capabilities to include AI trust services.

EY has invested over $1 billion in assurance technology and is building AI-specific assurance tooling across its global practice.

These are credible, well-resourced offerings from firms with deep expertise in governance, compliance, and audit. They answer a genuine set of questions: Is this AI system governed? Is it fair? Is it transparent? Does it comply with emerging standards, including the NIST AI RMF and applicable FTC guidance?

Here is the question none of them answer: will this AI program actually deliver business value?

That is not a criticism. Compliance assurance and delivery assurance are fundamentally different disciplines. The Big Four are exceptionally good at the first. The second requires a different kind of expertise entirely.

Zillow's algorithm worked exactly as designed. It passed every technical validation. It cost the company $500 million. A compliance audit would not have caught it.

The Delivery Gap

The data on AI program failure is now overwhelming, and it tells a story that compliance assurance cannot address.

McKinsey's 2025 State of AI report found that 88% of organizations have adopted AI in at least one business function. That sounds like progress. But the same survey found that only 6% of organizations have generated meaningful EBIT impact from their AI investments. The vast majority are spending money on AI with little to show for it at the bottom line.

RAND Corporation research found that approximately 80% of AI projects fail, roughly twice the failure rate of non-AI IT projects. The researchers attributed this not to poor model governance but to misaligned objectives, inadequate data infrastructure, and a failure to integrate AI outputs into existing workflows.

S&P Global's 2025 survey reported that 42% of companies had abandoned most of their AI initiatives, despite continued investment and executive enthusiasm. Rapid adoption followed by rapid disillusionment.

Gartner predicted that 30% of generative AI projects would be abandoned after proof of concept by the end of 2025, and separately forecast that 60% of AI projects will be abandoned by 2026 due to data quality, inadequate risk controls, or escalating costs. Their April 2026 research found that 57% of leaders whose AI projects stalled or failed said they "expected too much, too fast."

Consider Zillow's iBuying program. As Stanford's analysis documented, the algorithm worked as designed. It performed within its technical parameters. A model governance audit would have found it well-engineered and properly governed. The problem was not the model; it was the program around the model. Zillow lost $500 million and exited the business entirely.

The pattern is clear. The failure mode for AI programs is not ungoverned models. It is ungoverned transformations. Teams that lack the data foundations to support their ambitions. Organizations that underestimate the change management required. Leaders who expect productivity gains in months when the realistic timeline is quarters or years. Business cases built on hope rather than evidence.

A compliance audit will not catch any of this. It is not designed to.

Two Kinds of Assurance

This is the distinction that the current market conversation is missing. There are two fundamentally different questions that boards and investors need answered about AI, and they require two fundamentally different kinds of assurance.

Compliance assurance asks: "Is this AI system trustworthy?" It evaluates governance, fairness, transparency, safety, and regulatory alignment - with the NIST AI RMF, FTC guidance on algorithmic transparency, and sector-specific requirements from the SEC or healthcare regulators. This is the Big Four's territory, and they are well equipped to deliver it.

Delivery assurance asks: "Will this AI transformation deliver business value?" It evaluates data readiness, team capability, strategic alignment, workflow integration, change management, and the realism of the business case. This is operator territory, requiring people who have personally built and shipped production AI systems, managed the organizational change that AI demands, and seen firsthand why transformations succeed or fail.

Most organizations are only getting the first kind. They are investing in governance frameworks, appointing AI ethics boards, engaging Big Four firms to audit their model pipelines. All of which is necessary. But it addresses perhaps 20% of why AI programs fail.

You need both. A compliance audit that passes while the transformation fails is not a successful outcome. An AI system that is fair, transparent, and well-governed but delivers no business value is an expensive governance exercise.

McKinsey found that high-performing organizations redesign workflows around AI. 55% of them do this systematically, versus just 20% of low performers. The transformation is the hard part, not the model.

What Delivery Assurance Looks Like in Practice

If compliance assurance is about the AI system, delivery assurance is about everything around the AI system. It is the difference between asking "does this model work?" and asking "will this program deliver?"

Here is what a rigorous delivery assurance assessment covers.

Data readiness, not just data quality. Most organizations conflate these. Data quality is a property of individual datasets. Data readiness is whether your data foundations, as a whole, can support the AI use cases in your business plan. A company may have high-quality customer data but no way to link it to operational data, no data governance framework, and no data engineering capability to build the pipelines that AI models need. The models are ready. The data is not.

Team capability, honestly assessed. Can your team build and maintain what has been proposed? This is not about whether they have experimented with ChatGPT. It is about whether they have the engineering depth to build production AI systems, the MLOps maturity to deploy and monitor them, and the organizational knowledge distribution to avoid critical key-person dependencies. The METR study found that developers consistently overestimate their own AI productivity by a significant margin. Self-reported capability assessments should be treated with corresponding skepticism.

Strategic alignment with business outcomes. Is the AI program connected to measurable business results, or is it a technology initiative looking for a business justification? The 57% of leaders who told Gartner they "expected too much, too fast" were not suffering from poor model governance. They were suffering from a disconnect between AI capability and business reality. A delivery assurance assessment maps each AI initiative to specific business outcomes, validates the assumptions in the business case, and pressure-tests the timeline.

Workflow integration. McKinsey's research on AI transformation found that high-performing organizations do something fundamentally different from low performers: they redesign workflows around AI rather than layering AI onto existing processes. 55% of high performers systematically redesign workflows, compared to just 20% of low performers. This is perhaps the most underestimated factor in AI program success. Dropping an AI model into an existing workflow and expecting transformation is like installing a jet engine on a bicycle. The engine works. The system does not.

Change management and incentive structures. AI transformation changes jobs. Sometimes it eliminates them. The incentive problem is real and almost universally ignored: a CTO whose team will shrink by 30% if AI adoption succeeds has no rational incentive to make it succeed. A sales director whose commission structure rewards activity, not outcomes, has no rational incentive to adopt AI-driven lead scoring that reduces activity but improves conversion. Delivery assurance surfaces these incentive misalignments before they sabotage the program.

Realistic expectations. This may be the simplest and most valuable element. Is the timeline realistic? Are the cost assumptions defensible? Has the organization accounted for the learning curve, the integration work, the data preparation, and the inevitable setbacks? Gartner's finding that 57% of failed projects suffered from unrealistic expectations is not surprising to anyone who has delivered AI in production. It is surprising to boards who have been presented with vendor demonstrations and business cases built on best-case scenarios.

We assess these dimensions using our 5P Framework and AI readiness methodology, which maps AI capability across people, process, product, platform, and protection. The output is not a maturity score. It is a specific, actionable assessment of where the program will succeed, where it will struggle, and what needs to change.

What Boards and Investors Should Be Asking

If you sit on a board or an investment committee, you are almost certainly hearing about AI. You may have already approved significant AI investment. The question is whether you are asking the right questions about it.

Most boards ask compliance questions: "Is our AI ethical? Are we compliant with FTC guidelines? Do we have an AI governance framework?" These are valid and important questions. They are also insufficient.

Here are the questions that delivery assurance answers, the questions that determine whether the AI investment will actually generate a return.

What is the specific business case for each AI initiative? Not "AI will improve efficiency" but "this specific AI application will reduce customer onboarding time from 14 days to 3 days, saving $2.1 million annually in operational costs, with an 18-month payback period." If the team cannot articulate this level of specificity, the business case does not exist.
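A board can pressure-test this arithmetic directly. A minimal sketch of that check, using the hypothetical figures from the example above (the function name and numbers are illustrative, not drawn from any real engagement):

```python
def payback_months(upfront_cost: float, annual_savings: float) -> float:
    """Simple payback period in months, ignoring discounting."""
    return upfront_cost / (annual_savings / 12)

# Hypothetical initiative from the example above: onboarding cut from
# 14 days to 3 days, saving $2.1M per year. An 18-month payback
# implies roughly $3.15M of upfront investment.
annual_savings = 2_100_000
upfront_investment = 3_150_000

months = payback_months(upfront_investment, annual_savings)
print(f"Payback period: {months:.0f} months")  # → 18 months
```

If the team cannot supply the two inputs to a calculation this simple, the business case is not yet specific enough to approve.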

What are the KPIs, and who owns them? AI programs without measurable outcomes are technology experiments, not business transformations. Every initiative should have defined metrics, a baseline measurement, a target, and a named person accountable for delivery.

Who is the business translator? McKinsey's research consistently identifies a critical role in successful AI transformations: someone who can bridge AI capability and domain-specific business problems. Not a data scientist who understands the business. Not a business leader who understands AI. Someone who genuinely inhabits both worlds. If this person does not exist in the organization, the AI program will produce technically impressive outputs that nobody uses.

What is the realistic ramp period, and does it match the board's expectations? If the CEO is expecting productivity gains in Q2 and the realistic timeline for meaningful impact is Q4, that gap will create pressure that distorts decision-making. Better to know now.

Can the team sustain this after the consultants leave? This may be the most important question of all. An AI program that depends on external consultants or a single internal specialist for its continued operation is not a transformation. It is a dependency. Delivery assurance evaluates whether the organization is building the internal capability to own and evolve its AI systems independently.

For a structured approach to these questions, our AI readiness assessment guide covers the methodology in detail. For boards evaluating AI claims in the context of a transaction, our AI due diligence guide addresses the investor-specific lens.

The US Regulatory Context

Understanding the regulatory landscape matters because it shapes what kind of assurance organizations prioritize, and the US sits in a distinctive position.

The US has taken a sector-led, largely voluntary approach to AI regulation at the federal level. The NIST AI Risk Management Framework (AI RMF 1.0) provides a voluntary framework, while sector-specific regulators - the SEC on algorithmic trading and disclosures, the FTC on consumer protection and algorithmic fairness, the CFPB on credit decisions, the FDA on medical AI, and the EEOC on hiring tools - are applying existing authority to AI-specific concerns. Executive orders have directed agencies to develop AI-specific guidance within their mandates, but there is no single comprehensive federal AI law equivalent to the EU AI Act.

The EU AI Act, however, has extraterritorial reach. Any US company that places AI systems on the EU market, or whose AI outputs affect people in the EU, falls within its scope. High-risk AI obligations begin in August 2026, and US companies operating in European markets need to be prepared.

ISO 42001, the international standard for AI management systems, is rapidly becoming the governance baseline internationally, and enterprise customers and sophisticated investors are increasingly treating it as a proxy for AI operational maturity. KPMG's early certification signals the direction of travel.

The US therefore operates with significant regulatory flexibility domestically but faces the EU's prescriptive framework for any international operations. For boards and investors, this means compliance assurance matters - but the regulatory environment alone will not tell you whether your AI program is working. The gap between minimum compliance and successful transformation is where most of the value, and most of the risk, resides.

Where This Leaves Boards and Investors

Both kinds of assurance matter. Pretending that compliance covers the full picture is as dangerous as ignoring compliance entirely.

If you need compliance assurance, the Big Four are well equipped. They have the frameworks, the certifications, the audit methodologies, and the regulatory expertise. For governance, fairness, and regulatory alignment, they are the right choice.

If you need to know whether your AI transformation will actually deliver, you need operators who have built and shipped production AI systems, who have managed the organizational change that AI demands, who have seen firsthand why programs succeed and fail, and who can tell you honestly whether your business case is realistic. That is a different skill set from audit and governance. It requires practitioners, not auditors.

The organizations that will succeed with AI are the ones that invest in both. Governance without delivery is expensive compliance theater. Delivery without governance is reckless. The combination - compliance assurance and delivery assurance working together - is what boards and investors actually need.

If you are evaluating your organization's AI readiness, our AI enablement services and AI readiness assessment methodology provide the delivery assurance lens. For teams building AI capability, our AI bootcamp program bridges the gap between experimentation and production delivery.

References

  1. McKinsey & Company. The State of AI in 2025. QuantumBlack (2025).
  2. McKinsey & Company. The AI Transformation Manifesto. McKinsey Digital (2025).
  3. RAND Corporation. Identifying and Mitigating the Risks of AI Project Failure. RAND Research Reports (2024).
  4. S&P Global. AI Experiences Rapid Adoption but with Mixed Outcomes. S&P Global Market Intelligence (2025).
  5. Gartner. Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept. Gartner Newsroom (2024).
  6. Gartner. Lack of AI-Ready Data Puts AI Projects at Risk. Gartner Newsroom (2025).
  7. Gartner. AI Projects in Infrastructure and Operations Stall Ahead of Meaningful ROI Returns. Gartner Newsroom (2026).
  8. NIST. AI Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology (2023).
  9. PwC. PwC Launches Assurance for AI. PwC Newsroom (2025).
  10. Deloitte. Trustworthy AI Framework. Deloitte (2024).
  11. KPMG. KPMG Expands AI Trust Services. KPMG Newsroom (2025).
  12. KPMG. KPMG International First to Attain ISO 42001 Certification. KPMG Newsroom (2025).
  13. Stanford Graduate School of Business. Flip Flop: Why Zillow's Algorithmic Home-Buying Venture Imploded. Stanford GSB Insights (2022).

Need to know if your AI transformation will actually deliver?

We assess AI programs from the operator's perspective: data readiness, team capability, workflow integration, and business case realism. Compliance assurance tells you whether your AI is governed. We tell you whether it will work.