Columnists
30 June 2026 - 05:30

OKUMU: The AI improved the notes, not the patients

A celebrated study and a quieter trial reached different conclusions about the same tool

by NICHOLAS OKUMU

Let me show you something. Once you learn to see it, you will see it everywhere, and you will never again be impressed by a number simply because it is large.

Almost a year ago, a number travelled around the world. An artificial intelligence tool called AI Consult, built by Penda Health and OpenAI and used by clinicians across 15 Nairobi clinics, was reported to have cut diagnostic errors by 16 per cent and treatment errors by 13 per cent across nearly 40,000 visits.

The company published it on its own platform. The technology press called it the largest study of clinical AI ever run. For a country stretching every shilling of the Social Health Authority, it sounded like rescue.

Now a second study has arrived, quieter than the first. The same tool, the same clinics, but this time a carefully designed, pre-registered, randomised trial involving more than 9,000 patients, prepared for peer review in Nature.

It found that the tool made no measurable difference to whether patients actually recovered. Treatment failure within 14 days was the same, within the bounds of chance, whether the clinician had the AI or not.

Two studies. The same tool. Opposite headlines. So here is the question I want to teach you to ask: What exactly did each one measure?

This is the whole lesson, so slow down with me here. The first study measured the notes. Reviewers read the clinical documentation and judged whether it contained errors. The second study measured the patient. Did the person who walked in sick walk out well? These are not the same thing, and the gap between them is where most of our confusion about technology hides.

Think of a tutor you hire for your child. After a term, the exercise books are neater, the headings underlined, the working laid out beautifully. You are impressed. But the marks have not moved. The child is no better at mathematics. The tutor improved the record of the work, not the work itself. That is close to what happened here. The AI made the notes better. It did not, on the evidence, make the patients better.

Now, before you decide who to be angry with, let me be fair, because a good teacher is fair. This is not a story of dishonest scientists. It is the opposite. The trial was led by Kenyan researchers, approved by Kenyan ethics committees, run under our own data protection law and judged by Kenyan physicians.

The same people who produced the exciting headline also produced the sober correction and told us plainly about every interest they held. That is science behaving exactly as it should. And the tool was not worthless. It did improve documentation, and it trimmed antibiotic costs a little, enough in this trial to pay for itself. Hold on to that. It matters.

So if the scientists were honest and the tool does something, where is the difficulty? Here is the harder lesson, and it is the one worth carrying.

Power does not need to lie to you. It only needs to choose which true thing you hear, and to keep what is valuable once you have finished helping.

Watch how that works. The exciting study, built on the softer measure, arrived with a technology giant behind it and travelled the globe. Notice that it had not yet passed peer review. It was released as a preprint, posted online beside the company’s announcement, and that did nothing to slow it down.

The careful study, built on the harder measure, is the one now made to wait for the slow scrutiny of a journal, and it will reach far fewer people, because a correction never runs as fast as a claim. So the cheerful result reached the world before review, and the careful one arrives after it. There is no plot in that. It is simply how attention works, faster than scrutiny, and worth knowing.

The 16 and the 13 have already become received fact. A respected Stanford and Harvard review this year lists Kenya, without qualification, as a place where AI reduced clinical errors. The headline becomes the memory. The correction becomes a footnote that is easy to miss.

Watch, too, what is kept. The trial ran on our clinicians, our patients, our clinics, funded from abroad, powered by a model whose inner workings are closed and owned elsewhere. The paper is honest about what follows. The product can be bought, on its owner’s terms, for a setup fee and a monthly charge. We supplied the proving ground and the valuable sentence that the tool works in a place like ours. The owner keeps the model and the value. We license it back.

I am not teaching you to fear this technology. I have spent much of my own working life arguing that, used with care, these tools can carry scarce expertise to places that have none. I am teaching you to read it. There is a difference between a tool that helps and a headline that sells, and it is worth learning to tell them apart before we spend.

So keep three questions where you can reach them. When someone shows you a glowing result, ask first whether they measured the paperwork or the patient. Ask second whether they were looking hard enough to have found harm, and not only benefit. And ask third, the one we forget most often: when we have given the patients and the proof, what do we own at the end?

That is the lesson. The number that travelled the world was 16 per cent. The number that mattered was zero. Learn to see the distance between them, and you will not be sold the next one so easily.