By Tamara O’Brien, TMIL’s roving reporter
It’s not often that the worlds of football policing and corporate reporting coincide. But AI makes strange bedfellows of us all. And what’s come to light about a fairly obscure football match is a warning to any Powers That Be: trust and AI do not sit happily together.
Maccabi Tel Aviv F.C. were due to play Aston Villa in Birmingham in a Europa League game last November. Acting on information about violence involving Maccabi fans at previous matches against Ajax and West Ham, West Midlands Police classified the Villa fixture as high risk, and Maccabi fans were banned from attending. This provoked a hue and cry from all sides – from accusations of antisemitism to groups approving the ban with varying degrees of vehemence.
The ban went ahead. But this week, two facts emerged. Firstly, the Dutch police said that while fans did clash after the Ajax game, West Midlands Police had overstated the violence of the Maccabi fans; they were more the targets of it than the perpetrators. And secondly, pertinent to our discussion – the West Ham match that was key in the decision to ban didn’t exist. AI made it up.
As a result, Chief Constable Craig Guildford has just announced his retirement from the Force. After weeks of denying any AI involvement – including to the Home Affairs Select Committee – he finally admitted that Microsoft Copilot had indeed been used in compiling his report to the local Safety Advisory Group (which has the authority to impose a ban).
All of which leads me to say, in answer to the question at hand – yes! Disclose, disclose and disclose again. And take care about using AI in the first place. But for a more sensible, realistic and nuanced view, we turn to our panellists.
We began, as ever, with a reminder from Claire about why we’re all here. The point of reporting is to build trust between an organisation and its investors and other stakeholders, by providing two types of information. One type is accurate data and disclosures, in accordance with reporting requirements. The other is truthful commentary, which is the sincere opinion of the management and board as to what those disclosures mean for the company and its prospects.
Producing both, in the form of an annual or sustainability report, is a complex, lengthy, weighty, delicate and ultimately human undertaking. But also one that, with the march of regulation, is getting lengthier and more complex. So the appeal of AI is obvious – anything that can help the increasingly resource- and time-constrained reporter will get serious attention. But, given the nature of the task, the risks are also obvious – AI’s output, with its usual heavy sprinkling of generalisation, assumption, and fallacy, poses real threats to the accuracy and truthfulness of reporting.
Which is why investors et al might indeed want to know if, where and how AI has been used in the document on which they’re basing important decisions. But how do reporters even begin to tell them? Our panellists examined the challenge from their respective angles.
The nature of the beast
To give the debate some context, our tech panellist Diana began by summarising how the technology works in large language models (LLMs) like ChatGPT, Microsoft Copilot and Claude. She identified four common key features:
They’re probabilistic systems that produce a statistically likely next word in a sequence, in response to the bundle of words you’ve put in your query.
They're not deterministic: because they sample from likely words rather than always choosing the single most probable one, you might not get the same answer twice to the same question (there’s a minimal sketch of this after the list). And what they do isn’t research; they have no concept of why one response may be right and another wrong.
The model’s workings are not easily traceable. For its output to be verifiable, the user must explicitly instruct the model to cite its sources of evidence. (Which, Diana added, is what they particularly help companies with at Insig AI.)
They’re designed to please – by giving the ‘expected’ response, in language that our human brains find engaging.
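To make the first two features concrete, here is a purely illustrative toy in Python – the vocabulary and probabilities are invented for the example and bear no resemblance to how a real LLM is built or trained – showing that a ‘model’ which samples a statistically likely next word, rather than always taking the single top choice, can give different answers to the same prompt.

```python
import random
from collections import Counter

# Toy "model": a hand-made probability table over possible next words for one
# prompt. Real LLMs learn such distributions over an enormous vocabulary; the
# words and numbers here are invented purely for illustration.
NEXT_WORD_PROBS = {
    "The report was": {"accurate": 0.5, "late": 0.3, "fabricated": 0.2},
}

def next_word(prompt: str, greedy: bool = False) -> str:
    """Greedy picks the most likely word (always the same answer);
    otherwise we sample, so repeated calls can differ."""
    probs = NEXT_WORD_PROBS[prompt]
    if greedy:
        return max(probs, key=probs.get)
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

if __name__ == "__main__":
    prompt = "The report was"
    # Sampling: ask the same question ten times and the answers vary.
    print(Counter(next_word(prompt) for _ in range(10)))
    # Greedy decoding: the same answer every time.
    print(next_word(prompt, greedy=True))
```

Run it a few times and the sampled answers shift while the greedy pick never does – which is the essence of the ‘you might not get the same answer twice’ point.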
Diana then expounded on what this means for reporting, setting out the LLMs’ main capabilities and the strengths and weaknesses inherent in each.
LLMs are incredibly versatile, able to generate information on any topic under the sun. But they’re not expert, insightful or capable of speaking from experience.
They’re tireless, incredibly fast at processing vast amounts of information, and can work as long and as iteratively as we want them to, in a way that’s impossible for a human. But as they lack common sense and the ability to discriminate between truth and falsehood, what they produce is untrustworthy, and they’re often unhelpfully verbose.
They support creativity in that they can be a great sparring partner, helping to spark ideas. But obviously, in a professional and regulated context like corporate reporting, accuracy and the integrity of information must take precedence over creativity.
They're designed to produce language that’s so natural, polished and convincing that it’s hard to tell whether it’s been written by AI or a person. AI detection tools exist, but can be gamed. And to Claire’s point – what are the consequences for us humans, when we outsource the tough but fulfilling parts of our work to machines?
To bring it all back to reporting, Diana referred to Insig AI’s analysis that fed into the paper jointly produced with Falcon Windsor: Your Precocious Intern – How to use generative AI responsibly in corporate reporting. Last year, their research showed that internal chatbots, corporate documents and so on were already being used as feedstock for annual reports, with tools like Copilot being used to draft, refine and proofread documents.
Since then, the genie has well and truly exited the bottle. ‘There are now specialised tools, both on the market and being developed in-house, to leverage generative AI specifically to prepare information, and also write sections of the annual report or sustainability statement.’
For Diana, this is the point at which reporters really need to get into the detail of the reporting process and decide which tasks they’ll use the technology for. Will it just be for low-level spell-checking… or for styling, proofreading, maybe editing text… or will you use it to generate text? Diana’s advice: ‘Thinking about how far users will be able to trust the information they’re given is a good place to start.’
The times they are a-changin’
Seasoned corporate reporting pro Martin is no stranger to AI. AstraZeneca has been using it for ages in its small-molecule pipeline. LLMs, though, cast a whole new light on the company’s principle of ‘keep the human in the loop’… a principle, mused Martin, that the Chief Constable of West Midlands Police probably wishes he'd followed. (I wasn’t the only one to spot a timely anecdote then!)
When it comes to disclosure, Martin’s take was, what are we as corporate reporters supposed to be saying about the text in our AR? He told us about an experiment the reporting team at AstraZeneca carried out with their IT colleagues. They took two sections of their AR – one a therapy area review, the other a draft CEO review. They trained the AI tool on lots of press releases, quarterly earnings calls, past annual reports, notes to staff and so on, with very careful prompting.
‘What did we learn? Well, we got some superficially attractive and plausible summaries, especially when it came to medicines and clinical trial results in the therapy area review. But AI was much worse at strategy, or telling any kind of story, particularly in the CEO review. And when you looked closely, even the plausible summaries weren’t accurate. For example, the source material mentioned “1.3 billion patients”, which AI summarised as “billions of patients”. Not the sort of difference that’s acceptable in an annual report, or in a highly regulated organisation like AstraZeneca.
‘Sometimes it was simply wrong, for example when it lifted a time-bound statement from the 2024 report and suggested we use it in the 2025 report. So is AI really saving you time, when you have to spend so long checking it?’
The output he saw would not pass any of their usual AR checks and balances. It could not be considered true, verifiable, fair, balanced or understandable. And it would certainly not survive scrutiny by specialist contributors, the senior executive team, the audit committee and the Board. At least, not yet – but no doubt one day it would be good enough. And so disclosing the use of AI becomes less about what impact it’s had on the text, and more about what it means for the governance and rigour of the existing processes that organisations use in producing their AR – and, indeed, how the use of AI fits into those processes. That would give report readers assurance that AI is being considered, analysed and reported on as part of those processes, even as the technology and regulations change.
As is Martin’s wont, he concluded on a musical note, and left us with the words of Bob Dylan ringing in our ears: “Come writers and critics who prophesy with your pen / And keep your eyes wide, the chance won't come again / And don't speak too soon, for the wheel’s still in spin….” You know the rest.
Not how, but why
Freddie also reframed the question. For him, a useful way to address users’ concerns is not so much about explaining how we’ve used AI in our annual report, but why.
‘The fundamental question is, why is using AI better than using a human? It comes down to what Claire was saying about the core concept of trust. As an investor, my number one question is, do I trust what this document is telling me, for me to be able to make decisions off the back of it?’
The answer has to be robust, because as Freddie pointed out, people have different thresholds of trust. ‘If I were to survey my colleagues, some of the more tech-savvy ones would be more accepting of AI, but there’d be a large group who’d be dismissive or concerned. So already there’s a challenge, in that not everybody shares the same view on the trustworthiness of AI, regardless of how it's used.’
Efficiency is often cited as a benefit of AI, but that’s hardly a benefit to stakeholders. ‘I'd be very sceptical if the reason given for using AI was “because it saved me time having to write the thing myself”,’ said Freddie. ‘That's not really a basis for trust.’
However, an explanatory approach is not just about avoiding negatives – there are positives to be gained too. ‘If a company made it clear that they use AI for interrogating large data sets, for example, to provide interesting insights, that's a very different matter. So perhaps what they should be considering is, how does our use of AI add value to the user of the accounts? Because if it doesn’t add value, or risks destroying value – then the basis of trust is diminished.’ Moreover, if companies were required to explain why they’d used AI, it would make them take the risks to concepts like accuracy, representativeness and reputation more seriously. Because concepts have a nasty habit of becoming realities.
Freddie concluded with a parting gift of two AI ‘tells’: use of the em-dash and, more recently, the word ‘quietly’, apparently.
So, with the soul-searching part of the webinar over, the focus switched to Matters Arising and questions from the floor. When Claire asked Martin about his use of AI, he was happy to endorse its use as an editing tool when writing to strict AR word counts. ‘I find it incredibly helpful when editing things down from, say, 155 words to 150. It's really good at that.’
I can relate. Nothing’s more frustrating or time-consuming than rejigging whole passages just to fit a box on a page or screen. Which, I hasten to add, is totally different from Freddie’s rightful condemnation of using AI to generate text. And just like that, we’re into the weeds of it, aren’t we? We’re going to need a whole new vocabulary just to describe the various acts and applications of writing, both human and electronic.
And now I really must wrap up, or my verbosity will have those AI-detecting antennae twitching. But much more emerged from the questions that followed… wondrous tales of CEOs frozen in time, of good news presented as bad, of the PhD student in your pocket… which you can hear all about by watching the playback.
As ever, the last word was left to our guests. What’s the future, asked Claire; AI all over our ARs?
Martin: Who knows where it’s going! Agentic AI?
Diana: Take it all with a pinch of salt, we’re still in the experimental phase. People are working hard to build tools that are beneficial. Keep experimenting and learning.
Freddie: Impossible to predict. In six months’ time it will all be different. Harness the core principle of trust as a grounding mechanism.
