AI in Disease Detection


The opportunities of using AI for early disease detection

David Crosby

David Crosby is Head of Prevention and Early Detection Research at Cancer Research UK (CRUK). Previously, at the Medical Research Council, he oversaw various science areas and research funding programmes (including inflammation, cardiovascular and respiratory research). He is now developing and implementing a new strategy and programme of research investments at CRUK which aims to accelerate progress towards earlier detection and prevention of cancer, through an integrated multidisciplinary approach, driven by equitable improvements in health outcomes.


  • AI is a pattern recognition engine with many potential applications in early disease detection
  • The hope is that AI can be trained to eliminate variability and subjectivity
  • AI can potentially spot more complex, subtle and multifactorial patterns than a human can
  • AI may be able to successfully combine multiple streams of data into an integrated assessment
  • There are important ethical issues in this subject

The early detection of disease depends on three key issues. The first is who you test, which is generally a matter of understanding risk. The second is how you test, which is a matter of technology. The third is what you are testing for, and the nature of what has actually been found.

AI is set to have a major impact on all three of these domains, as it will in virtually every aspect of human life. What does it mean for the early detection of cancer? Aspects of scientific research and clinical care delivery are already being impacted by AI.

There is a great deal of evidence to show that AI and machine learning can be trained to replicate the function of, for example, a pathologist. When a biopsy of a potential tumour is taken today, a pathologist looks at a stained slide and makes a judgment about whether this is cancer or not. That judgment is based on pattern recognition: are there patterns of unusual cellular shape and behaviour? AI is, of course, a pattern recognition engine, as is the human brain.

The machine can, in fact, be trained to essentially replicate the judgments that humans make. The problem with training the machine against human judgments as the gold standard is that it can then only be as good as those judgments. And human judgments are inherently flawed too. People can only perceive what they can perceive, so there is always the potential for small things to be missed or misinterpreted.

There is also an issue of reproducibility between individuals: a different pathologist may look at the same image and give a different judgment to her colleague. Indeed, there is evidence in the literature of variability within individuals: a given radiologist at different moments of the day could make different judgments from the same scan.

The hope, then, is that machines will not just replicate human performance, but actually transcend it and eliminate variability and subjectivity. That applies to all image-based recognition areas, whether reviewing an MRI scan, looking at pathology slides, or looking at images from inside your lungs or colon. AI is already showing the potential to match human performance in these areas, and the hope is that it can exceed it.

Other technological advances will synergise with AI. For example, with early detection of colorectal cancer, the first test today checks the faeces for blood. If blood is found, the patient is referred for a colonoscopy. A camera on a tube is inserted into the body and the images are interpreted visually by a human, who may or may not spot anything suspicious.

There are now technologies such as capsule endoscopy. The patient swallows a miniaturised camera in a pellet, which then passes down through the colon, conducting continual video surveillance as it goes. AI can be used to review the hours of footage, a task that would be extremely time-consuming for a human. A more complete picture then emerges from a combination of technologies.

Blood tests

Another advance is the multi-cancer early detection (MCED) blood test. Tumours are unstable: they break down and release their contents into the bloodstream. That tumour DNA is subtly different from normal, healthy DNA. But the fragments are so small and the concentrations so low that they are quite difficult to spot. Now, though, sequencing technology has advanced to the stage where this looks like a feasible method of early detection. And the advantage of blood tests is the ability to test for a number of different cancer types at the same time. In a given individual, one can be looking for lung cancer, brain cancer and colon cancer all at the same time.

There is a very large trial of one such technology happening in the NHS right now. That technology has been evolving over the past 10 years or so. It originally came from looking at foetal genomic aberrations. The traditional test for Down's syndrome is ultrasound followed by an amniocentesis. That has now been superseded by a blood test, because fragments of foetal DNA float around the mother's bloodstream.

So we have proof of principle, but every cancer is different and every mutation is different. There are hundreds of thousands of permutations. Looking across all those different variations would be impossible for a human. But AI can search for those patterns. These MCED blood tests are now feasible because machine learning has been employed to identify the optimal complex biomolecular signatures to search for, signatures which may tell us not just that cancer is present, but where in the body it may be. That is a clear example where AI has jumped beyond what humans are capable of.


If it were known who was at elevated risk of developing cancer, or indeed any other disease, then we could potentially be much better at early detection. Any given disease is relatively rare in the general population. And tests are imperfect. If you tested everybody for cancer every year, there would be vastly more false positives than true positives. That is a problem with any testing or detection strategy.
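The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration: the population size, prevalence, sensitivity and specificity below are hypothetical values chosen to show the effect, not figures from any real test or trial, and `screening_outcomes` is an illustrative helper name.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected true positives, false positives and positive predictive value.

    All inputs are assumptions supplied by the caller:
      prevalence  - fraction of the tested group that has the disease
      sensitivity - fraction of diseased people the test flags
      specificity - fraction of healthy people the test correctly clears
    """
    with_disease = population * prevalence
    without_disease = population - with_disease
    true_positives = with_disease * sensitivity
    false_positives = without_disease * (1 - specificity)
    # Share of all positive results that are real disease
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Hypothetical scenario 1: test everyone, disease is rare (0.5% prevalence)
general = screening_outcomes(100_000, 0.005, 0.90, 0.95)

# Hypothetical scenario 2: same test, but only in a group with 5% prevalence
elevated = screening_outcomes(100_000, 0.05, 0.90, 0.95)

for label, (tp, fp, ppv) in [("general", general), ("elevated risk", elevated)]:
    print(f"{label}: {tp:.0f} true positives, {fp:.0f} false positives, "
          f"{ppv:.1%} of positives are real")
```

With these assumed figures, testing the general population produces roughly eleven false positives for every true positive. Applying the same test to a group in which the disease is more common makes a far larger share of the positive results genuine.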

One way of improving the detection rate is by only testing people who are at elevated risk. But how do we know who is at elevated risk? There are many areas of research where people are looking at different types of risk factor. Some people are interested in genomic risk of cancer: the genes you were born with, which put you at different levels of risk. Others are interested in socio-economic determinants of health, which include the environment you are born into and the pollution you grew up in and are exposed to day by day. Then there are people who are interested in behavioural risk, e.g. diet, exercise and tobacco smoke, all of which are risk factors.

Now, AI might enable us to integrate all of these. That is a very complex proposition because it involves thousands of variables. Yet that is the hope: that AI will make a major impact in integrating multimodal data to assess who is at risk and who should be tested in the first place.

So those are the three main areas where we will see significant impacts. One of the caveats, though, is that just finding something is not the end of the story. A great deal of disease can be inconsequential. Many prostate cancers are inconsequential, for example. They are growing so slowly that the individual will die of something else long before prostate cancer would do any harm.

Equally, building a machine that detected everything would generate a huge treatment burden on the NHS, and that treatment may or may not actually have any impact on anyone's quality of life or longevity.

Among other big caveats are stress and anxiety and their psychological ramifications. We have to think very carefully about what people want to know about risk and whether they have incipient disease, and whether we can really help them to lower that risk or prevent that disease. Those are important ethical considerations as we think about the future of AI in early detection.