Sit through enough AI strategy meetings at a UK insurer or bank and the conversation starts to fall into a recognisable shape. The vendor presents. The slides are good. The use cases are tight – claims triage that halves cycle times, fraud models that catch what humans miss, pricing that holds up to actuarial scrutiny. The CFO asks reasonable questions and gets reasonable answers. The board signs off.

Then, somewhere between months three and six, someone from the data team puts a slide in front of the steering group that wasn’t in the original deck. It is never framed as bad news – data teams have learned to phrase these things carefully – but the substance is clear enough. The models cannot see what they need to see. The historical claims data the system was meant to learn from is scattered across a combination of physical files, microfiche, image-only PDFs without an OCR layer, and a string of legacy databases inherited through acquisitions that were never properly merged. The pilot will run. The numbers will look promising on the slice of data that happens to be clean. But the production system the board approved is going to underperform the business case, and the data team can already see by how much.

This is not, in 2026, an unusual story. MIT’s Project NANDA research, widely reported during 2025, found that 95% of organisations deploying generative AI saw zero measurable financial return ^[1]. S&P Global Market Intelligence’s 2025 enterprise survey came at the problem from a different angle: 42% of companies abandoned most of their AI initiatives that year, against 17% the year before ^[2]. Informatica’s 2025 CDO Insights survey provided the diagnosis: 43% of chief data officers named data quality and readiness as their single biggest obstacle to AI success, while only 12% reported having data of sufficient quality and accessibility for AI applications ^[3].

None of those figures is a model problem. They are a map problem.

A map problem, not a model problem

The analogy that keeps surfacing in these conversations is the self-driving car. The car can be a state-of-the-art engineering marvel; it will still drive into a hedge if the underlying digital map is wrong, incomplete, or in the wrong format. The map is not an accessory. It is the thing the car uses to know where it is.

AI claims systems and the historical archives they depend on stand in the same relationship. A model is only as good as what it can see. If thirty per cent of an insurer’s relevant historical claims sit in physical files in a third-party storage facility, the model cannot see them. They might as well not exist. It will learn confidently from the seventy per cent it can read and apply that learning to one hundred per cent of incoming claims, and nobody on the production team will know which territories it is blind to until the loss ratios begin to tell them.
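The arithmetic behind that blind spot can be sketched in a few lines. The example below is illustrative only: the 70/30 split comes from the paragraph above, but the average claim costs are invented figures, and the function names are hypothetical. It shows how far an estimate learned from the readable records alone drifts from the true figure when the unread archive differs systematically.

```python
# Illustrative sketch only. The claim costs below are assumed figures,
# not data from any real book of business.

def blended_mean(visible_share: float, visible_mean: float, archived_mean: float) -> float:
    """True average claim cost across both readable and archived records."""
    return visible_share * visible_mean + (1.0 - visible_share) * archived_mean


def blind_spot_error(visible_share: float, visible_mean: float, archived_mean: float) -> float:
    """Relative error of an estimate built from the readable records alone.

    A negative value means the readable-only estimate understates the truth.
    """
    true_mean = blended_mean(visible_share, visible_mean, archived_mean)
    return (visible_mean - true_mean) / true_mean


# Assumption: 70% of claims are readable and average 4,000 GBP;
# the archived 30% are older, larger claims averaging 6,500 GBP.
true_cost = blended_mean(0.70, 4_000.0, 6_500.0)
error = blind_spot_error(0.70, 4_000.0, 6_500.0)

print(f"True average claim cost: {true_cost:,.0f}")
print(f"Readable-only estimate is off by {error:.1%}")
```

Under these assumed figures the readable-only estimate understates the true average by roughly a sixth. If the archived claims happened to resemble the readable ones, the error would shrink towards zero; the difficulty is that nobody can know which case applies until the archive can be read.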

The cost of confident decisions on incomplete data

The risk here is not symmetric with the pre-AI baseline. An organisation handling claims manually on patchy historical data is also producing imperfect decisions, but human judgement is itself a correction mechanism. A claims handler who sees something that does not fit will query it, escalate, or apply experience to it. A model does not do that. It applies the pattern it has learned to whatever input arrives, at scale and at speed, with no internal flag for “this might be wrong.” The same incompleteness that produces occasional errors in a manual process produces consistent, confident, systematic errors in an automated one, applied to every decision the system makes, simultaneously, until something downstream finally notices.

The FCA appears to have noticed too. On 3 December 2025 it launched AI Live Testing, a supervised environment in which firms run AI tools in real market conditions under regulatory oversight ^[4]. The first cohort included NatWest, Monzo, Santander and Scottish Widows; the second, announced in April 2026, added Barclays, Experian, Lloyds Banking Group and UBS, among others ^[5]. The Authority confirmed in September 2025 that it does not intend to introduce AI-specific rules, relying instead on Consumer Duty and the Senior Managers and Certification Regime to govern outcomes ^[6]. In other words, the firms deploying these systems remain accountable for what the systems produce, including for the consequences of the data the systems were trained on.

What “AI-ready” actually means for the archive

The characteristics of a data foundation an AI system can actually use are less mysterious than they tend to sound in vendor decks. The records have to cover the full relevant universe of products, books and time periods that the model needs to learn from; gaps in coverage become blind spots in the model. The same concepts have to be coded the same way across systems, which is harder than it sounds in an organisation that has grown through acquisition. The data has to be accurate in the dull, unglamorous sense of fields populated correctly and values entered as the underlying reality demands. And it has to exist in a form the system can read, which paper, microfiche and image-only scans without OCR layers are not.
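Those four characteristics (coverage, consistent coding, accuracy and machine readability) can each be turned into a simple check. The sketch below is a minimal illustration, not a description of any particular vendor's tooling: the `ClaimRecord` fields, the list of canonical cause codes and the sample records are all hypothetical, chosen only to show the shape of such an audit.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass
class ClaimRecord:
    """Hypothetical minimal view of one historical claim."""
    product: str
    year: int
    cause_code: str
    paid_amount: Optional[float]
    machine_readable: bool  # False for paper, microfiche, image-only scans


# Assumption: the organisation has agreed one canonical coding scheme.
CANONICAL_CAUSE_CODES = {"FIRE", "FLOOD", "THEFT"}


def readiness_report(records, expected_products, expected_years):
    """Summarise the four readiness checks for a set of records."""
    readable = [r for r in records if r.machine_readable]
    seen = {(r.product, r.year) for r in readable}
    expected = set(product(expected_products, expected_years))
    return {
        # Coverage: product/year cells with no readable records at all.
        "coverage_gaps": sorted(expected - seen),
        # Consistency: codes that do not match the canonical scheme.
        "nonstandard_codes": sorted(
            {r.cause_code for r in readable
             if r.cause_code not in CANONICAL_CAUSE_CODES}
        ),
        # Accuracy: missing or impossible paid amounts.
        "invalid_amounts": sum(
            1 for r in readable
            if r.paid_amount is None or r.paid_amount < 0
        ),
        # Readability: share of records the system cannot read.
        "unreadable_share": (
            (len(records) - len(readable)) / len(records) if records else 0.0
        ),
    }


# Invented sample data for illustration.
records = [
    ClaimRecord("motor", 2019, "THEFT", 1200.0, True),
    ClaimRecord("motor", 2020, "Theft-01", 900.0, True),   # legacy code
    ClaimRecord("home", 2019, "FLOOD", None, True),        # amount missing
    ClaimRecord("home", 2020, "FIRE", 15000.0, False),     # still on paper
]
report = readiness_report(records, ["motor", "home"], [2019, 2020])
for check, result in report.items():
    print(f"{check}: {result}")
```

On this invented sample the report flags one coverage gap (home claims for 2020, which exist only on paper), one non-standard code, one missing amount and a quarter of the records as unreadable. A real audit would involve far more fields and far messier rules, but the categories of finding are the same.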

Addressing all of that across an organisation with decades of accumulated archive material is a substantial piece of work, and the work is unglamorous in a way that AI investment is not. The board has approved an AI programme, not an archive programme. The CFO is being asked about model performance, not OCR throughput. The Authority, when it gets involved, asks about outcomes for consumers, not about the linear metres of files in storage that produced those outcomes. And yet the archive is, in most regulated financial institutions of any age, precisely where the bottleneck sits.

The conversation that changes the outcome

The most useful question a non-executive director can ask the CFO about an AI programme is not “is the model performing?” It is “what data is the model not seeing, and where is it?” The answer, in most cases, involves a third-party storage facility on a trading estate somewhere off a motorway, a quantity of material measured in linear metres or pallets, and a vague-but-significant percentage of the organisation’s historical record. That percentage is the gap between what the AI investment was meant to deliver and what it will actually deliver.

The conversation that changes the outcome of an AI programme is therefore not, in the end, a conversation about the AI. It is a conversation about the archive the AI is supposed to learn from – and about whether the work to make that archive readable to a machine has been scoped, funded and begun.

How Dajon makes the archive readable

Dajon Data Management has been doing this work for regulated financial institutions for some time, and the shape of it is now well-rehearsed. It begins with an assessment of what the archive actually contains and what state it is in. The digitisation programme follows: scanning, OCR, classification, structured indexing, and integration with the systems that need to use the records. Phased delivery matters more than total throughput; prioritising the highest-value material first means the AI programme can begin learning from properly structured data long before the archive is fully through, which in turn means the business case starts returning before the digitisation budget is exhausted. The work is methodical and undramatic, which is exactly the point. The systems downstream depend on it being done that way.

The work that actually matters now

The AI investment has already been made in most organisations that needed to make it. The technology is real, the use cases are sound, and the regulatory framework is – with the FCA’s Live Testing programme now operating with two cohorts of major banks and insurers – more explicit about expectations than it has been at any point in the last decade. The thing standing between the investment and the return, in most cases, is not another vendor contract or another platform decision. It is the archive. And until the archive is in a state the AI can use, the business case the board approved will sit roughly where it is now: plausible, defensible, and quietly underperforming.

The car is paid for. The map is in the boxes. The work to get it out of them is the work that actually matters now.

References

1. MIT report: 95% of generative AI pilots at companies are failing. Fortune.
2. Why most enterprise AI projects fail — and the patterns that actually work. WorkOS.
3. Why 90% of Enterprise AI Implementations Fail (2026). Talyx.
4. FCA helps firms to test AI safely. FCA.
5. FCA announces second cohort for AI Live Testing. FCA.
6. AI Regulation in Financial Services: Turning Principles into Practice. Bryan Cave Leighton Paisner.

The Self-Driving Car Without a Map: Why AI in Regulated Finance Stalls at the Archive