There are strong hopes that AI can revolutionise health care systems, increase their productivity and improve patient outcomes. Earlier this year, the Prime Minister announced plans to “unleash AI” to boost productivity in public services, including the NHS.
Radiology diagnostics is one area of the NHS where there is optimism that AI, working alongside human readers, will be able to reduce diagnostic error and improve early disease detection within existing workforce constraints. As the government rolls out its Analogue to Digital plans, adoption is already widespread: in 2023, 54% of NHS trusts were using AI tools within radiology to interpret, prioritise or report on images, and that figure is likely to have grown since. Yet we still lack a thorough understanding of AI’s impact on patients, staff and wider hospital systems.
To address these gaps, the NIHR Rapid Service Evaluation Team (RSET) conducted a systematic scoping review of 140 studies in the international literature on AI in radiology diagnostics. We looked at the diagnostic accuracy and workflow efficiency of AI as well as its cost effectiveness, how staff and patients perceive and are experiencing AI, and how it is implemented in radiology. We also hosted workshops with radiology staff and the public to discuss our findings.
So what did we find? As AI continues to be scaled up nationally and the technology continues to evolve, here are three key messages that we think policy-makers need to consider.
The benefits are promising but caution is needed
For AI to support diagnostic testing accurately, it needs to correctly identify people with the disease (true positives), correctly rule out disease (true negatives), avoid misdiagnosing healthy people with disease (false positives) and, worst of all, avoid failing to detect disease in those who have it (false negatives).
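To make these four outcomes concrete, here is a minimal sketch with entirely hypothetical counts (not drawn from the studies we reviewed), showing how they translate into the sensitivity and specificity figures discussed below:

```python
# Hypothetical example: 1,000 scans, 50 of which truly show disease.
# These counts are illustrative only, not taken from any real tool or study.
true_positives = 45   # disease present, correctly flagged
false_negatives = 5   # disease present, but missed (the worst outcome)
true_negatives = 900  # no disease, correctly ruled out
false_positives = 50  # no disease, but flagged anyway

total = true_positives + false_negatives + true_negatives + false_positives
assert total == 1000

# Sensitivity: of those with disease, how many were caught?
sensitivity = true_positives / (true_positives + false_negatives)  # 0.90
# Specificity: of those without disease, how many were correctly cleared?
specificity = true_negatives / (true_negatives + false_positives)  # ~0.95

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```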
While there is some promising evidence that AI might improve diagnostic accuracy when used alongside a human reader (particularly for less experienced staff) and can improve workflow efficiency, the evidence is limited overall and is mixed across studies.
Our review identified 25 studies that specifically measured AI’s ability to correctly identify signs of disease, of which 19 showed improvements. In 13 of 18 studies, AI also improved the time taken to interpret images when used as an assistive diagnostic tool.
However, the evidence also shows that AI can increase the number of false positive cases, which may increase demand for diagnostic and treatment services further along the pathway and place an emotional burden on patients. Moreover, few studies measure hospital and pathway efficiencies beyond immediate metrics such as reading and reporting time, so the effect of AI on longer-term pathway and patient outcomes remains unknown.
This balance between sensitivity (the proportion of people with disease who are correctly identified) and specificity (the proportion of people without disease who are correctly ruled out) is inherent in all diagnostic tests: when sensitivity increases, specificity tends to decrease. While AI technologies may improve and become more accurate over time, policy-makers need to be mindful of the risks of false positives and the unnecessary worry this may cause for patients, as well as the possibility of AI increasing demand on already overstretched services.
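A toy simulation can illustrate this trade-off. Assuming, purely for illustration, an AI tool that outputs a suspicion score and flags any image above a threshold, lowering that threshold catches more true disease but also flags more healthy people. The scores below are simulated, not taken from any real tool:

```python
import random

random.seed(0)
# Simulated suspicion scores: diseased cases tend to score higher,
# but the two groups overlap, as they do for any imperfect test.
diseased = [random.gauss(0.7, 0.15) for _ in range(200)]
healthy = [random.gauss(0.4, 0.15) for _ in range(800)]

for threshold in (0.65, 0.55, 0.45):
    sens = sum(s >= threshold for s in diseased) / len(diseased)
    spec = sum(s < threshold for s in healthy) / len(healthy)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")

# Lowering the threshold raises sensitivity but lowers specificity:
# more true cases are caught, at the cost of more false positives.
```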
One reason the evidence is currently limited is that most studies test AI within simulated workflows rather than in real-world clinical use, and are therefore disconnected from the realities of local hospitals. This leaves uncertainty about how effective AI will be at delivering the improvements many hope for.
Where are patients’ voices?
In our review, 74 studies explored what staff, trainees, patients and members of the public think about using AI for diagnostics in radiology. In most of these studies, AI was viewed favourably when used to assist diagnostics, but only 14 papers studied experiences of using AI in clinical practice, and none of these explored patient experiences of receiving care supported by AI. We therefore don’t yet know enough about patient experiences of AI-based care.
These gaps raise important patient-related questions as AI continues to be implemented, namely:
- What do patients (including carers) know about AI and how these technologies can be used in their diagnostic care?
- Should patients be informed about AI and provide their consent? If so, how should the use of AI be communicated and how much information should be included?
- Do patients support and accept the use of AI? And in what capacity do they view AI favourably?
Implementation requires adequate resources and evidence
The effectiveness of AI ultimately depends on how it’s implemented, and it’s no small task to integrate such a complex intervention into an overstretched NHS with multiple and varied IT systems. Implementing AI demands significant resources, posing challenges to both health care budgets and existing staff time.
A lack of clear guidance, a frequently cited barrier in the literature, means that organisations are not ready to support such technical advancements. The current implementation of AI was described as “the wild west” in our workshops, with no consistent approach or standard set of principles to support delivery. While variation and local adaptation will be inevitable, our findings do suggest that implementation and use of AI are currently outpacing the development of a robust evidence base on how it should be used.
Hold your horses
While many important unknowns remain about how AI will perform in live clinical practice, the evidence also shows its significant promise and possible benefits. AI’s ability to interpret images quickly and correctly has the potential to improve workflows and free up clinicians' time to focus on more complex images, while improving the timeliness of diagnoses (provided that false positives are kept in check).
However, there are challenges around the pace of implementation, and the lack of patient voice makes it difficult to address the practical, legal and ethical implications of AI use in clinical diagnostics.
While policy-makers are keen to extol the many benefits of AI, it is difficult to be clear on the specific benefits in practice, given that the evidence is still developing. So perhaps the more salient question to ask is: “What specific problem would you like AI to solve?”
Asking this should sharpen the focus on developing problem-driven AI tools, designed with specific health care challenges in mind and able to generate evidence in live clinical practice. This would help balance the push for implementation with the stronger evidence base that many clinicians are crying out for but that we do not yet have.
*For more details on our systematic review findings, read our paper published in The Lancet eClinicalMedicine. You can find more information on RSET’s full evaluation of AI in radiology diagnostics on our project page.*
Notes
We reviewed a total of 140 relevant studies published between 1 January 2020 and 31 January 2025. Additional inclusion criteria were as follows:
- Empirical studies (covering implementation, experiences, perceptions, quantitative or cost outcomes).
- Quantitative studies needed to evaluate AI as a support tool for human decision-making rather than in isolation, in line with current guidance that AI should be used with human supervision.
- Those that focused on AI being used to support diagnostics in radiology (algorithmic use for image interpretation and decision-making, not image generation).
- Studies covering UK-based and international evidence and written in English.
Suggested citation
Dodsworth E and Lawrence R (2025) “Curb your enthusiasm: what does the evidence tell us about using AI in radiology diagnostics?”, RSET blog