Endoscopy 2021; 53(03): 285-287
DOI: 10.1055/a-1367-1979
Editorial

Artificial intelligence for colonoscopy: the new Silk Road

Referring to Barua I et al. p. 277–284
Alessandro Repici (1, 3), Cesare Hassan (2)
1   Department of Biomedical Sciences, Humanitas University, Milan, Italy
2   Endoscopy Unit, Nuovo Regina Margherita Hospital, Rome, Italy
3   IRCCS Humanitas Research Hospital, Milan, Italy

It all began with a few hybrid information technology–medical studies claiming that a new type of software could recognize endoscopic lesions on offline images [1]. The use of the term “artificial intelligence” (AI) underlined how the learning process was driven by automatic extraction of main features from annotated images rather than by the application of human judgment [2].

The initial excitement was mirrored by a contrary skepticism. How often had similar claims for a variety of devices or techniques ended in disappointment? We had sadly become familiar with part of the Gartner Hype Cycle for emerging technologies: an initial boost of good results ultimately reversed by negative studies. The limitations of AI certainly could not be ignored. First, the AI software was validated only against human ground truth as a reference standard in artificial studies, indicating that in the best scenario AI was equivalent to experts but not superior. Secondly, most of these AI systems were validated offline, generating uncertainty about their feasibility in a real-time setting [3]. Thirdly, the first prototypes for real-time application required two parallel monitors, one for the endoscope and one for the corresponding AI images, because of an unavoidable delay as the signal passed through the AI system [4] [5]. Fourthly, there was the fear of a pointless waste of time, as some systems presented extremely high rates of false-positive results, in up to 20 % of the entire videos. Fifthly, there was extreme heterogeneity among the AI systems regarding the training dataset, the validation procedures, and the architecture of the AI algorithm. Finally, there was the new AI jargon, with several terms, such as “training,” “cross-validation,” and “testing,” being used in an information technology sense that was unfamiliar to both academic and community endoscopists [6].

Such skepticism, however, could not outweigh our desperate need for AI in screening colonoscopy. Despite the undeniable increases in adenoma detection rates (ADRs) due to quality assurance, we still miss one in every four neoplastic lesions [7] [8]. This may be related to distraction or fatigue, especially after several hours of activity, as well as to suboptimal competence in recognition. The latter is likely to be the case for flat advanced lesions, such as nongranular lateral spreading tumors (NG-LSTs). In turn, missed lesions have been estimated to be the predominant factor contributing to post-colonoscopy colorectal cancer rates, with an incidence as high as 1 % over 10 years [7] [9]. In addition, an embarrassing variability in ADRs across any series of endoscopists raises questions about the status of screening colonoscopy as a clinical standard. Of note, in relative terms, low detectors can miss up to 75 % of neoplastic lesions as compared with high detectors [7]. Unsurprisingly, similarly high values of missed diagnosis have been reported for Barrett-related dysplasia or early gastric cancer, indicating a general reluctance of some endoscopists to recognize their own underperformance [10] [11].

Dum Romae consulitur, Saguntum expugnatur! [Whilst they deliberated in Rome, Saguntum was captured!] While the pros and cons of AI were being compared, science was suddenly overtaken by technology. All the major players in the field of endoscopy upgraded their AI systems with graphic interfaces able to display colorectal lesions in a real-time setting, superimposing a clearly visible mark on any suspected lesion with a high degree of accuracy. These systems were released immediately in the European market as regulatory clearance was primarily based on artificial studies. While there had not been enough time to reply to the initial question “Should we implement AI?”, the next question suddenly became: “What are the effects of AI on screening colonoscopy?”

The main uncertainty was about the interaction between the human endoscopist and the AI system in a clinical setting. What if endoscopists relied excessively on the system, to the detriment of their own performance? What if any false-positive result were to lead to a pointless resection? What about the additional withdrawal time, the costs, and the convenience of AI systems? What we needed was one or more randomized clinical trials (RCTs) that benchmarked the outcome of AI against a control group. While results were anxiously awaited from the best-known Western centers, somewhat unexpectedly the first RCT came from a Chinese center that, we must admit, was fairly unknown to the colonoscopy community in the pre-AI era [4]. Nevertheless, “one swallow doesn't make a summer!” First, the AI system used in that trial was not available in Western countries. Secondly, the mean ADR in the control group was much lower than those currently reported in Western settings. In addition, it was a single-center study, raising doubts about the generalizability of the data. Even more unexpectedly, the second and then the third study came again from different Chinese centers, with different AI systems and different methodology, but with a similar risk of bias [12]. The center of gravity had definitively moved from West to East, but the final implications were still unclear.

A “big bang” was needed, and this explains the importance of the pivotal meta-analysis published by Barua et al. [13]. This offers a synoptic snapshot of the series of five consecutive Chinese RCTs on AI in screening colonoscopy: unequivocally homogeneous AI-related increases of 52 % in ADR and 78 % in mean adenoma per colonoscopy (APC) rate were shown in more than 4000 randomized patients [13]. Both the magnitude of the AI benefit and the intertrial consistency were far from unexpected. More than plausible, it was almost inevitable that any endoscopist would be advantaged when moving from the detection of a subtle flat neoplasia to that of a large colored rectangle ([Fig. 1]) surrounding such a lesion. In addition, no harmful effects in terms of pointless resections for non-neoplastic lesions or on withdrawal time emerged from the meta-analysis; this was reassuring regarding the competence of the endoscopists in discarding false-positive results. It could be argued that there is no certainty that the additional detection was driven by AI. However, when considering the average sensitivity of 90 % or more of these AI systems, there is no plausible reason why this value should not be applied to the additional detection found in these studies. It could also be argued that most of this increase was driven by additional detection of adenomas < 5 mm in size, rather than by detection of larger and potentially more advanced lesions. However, this is somewhat unfair. First, most of the ADR comprises diminutive lesions; thus any ADR increase will necessarily be driven by an increased detection of such lesions. Secondly, ADR is a technical indicator that is related to a clinical outcome, namely the risk of post-colonoscopy colorectal cancer. Thus, the actual reason for the ADR increase is of technical rather than clinical relevance. Thirdly, even this meta-analysis is clearly underpowered for such lesions, when considering the very low prevalence of larger or advanced neoplasia in the relatively young Chinese population.

Fig. 1 Detection by artificial intelligence (GI Genius, Medtronic) of a colorectal flat neoplasia. The boundaries of the green rectangles are much more visible than the margins of the sessile serrated lesion, corroborating the increase in detection shown by the meta-analysis of Barua et al. [13].

The East called, the West responded! In early 2020, our group finally published the first Western multicenter RCT on AI in screening colonoscopy, showing 30 % and 46 % increases in ADR and APC, respectively, with no significant change in withdrawal time or non-neoplastic resection rate [14]. Despite a much higher ADR in our control group compared with that of the meta-analysis, our data fully confirmed the estimates of Barua et al., supporting the benefit of AI in screening colonoscopy. The gap between East and West was finally closed, while opening the new Silk Road of AI for a more global and inclusive scientific partnership!

Correction

Artificial intelligence for colonoscopy: the new Silk Road
Repici A, Hassan C. Endoscopy 2021; 53: 285–287.
In the above-mentioned article, the institution affiliation 1 has been corrected and the institution affiliation 3 was added. This was corrected in the online version on April 16, 2021.



Publication History

Article published online:
25 February 2021

© 2021. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 
References

  • 1 Misawa M, Kudo S-E, Mori Y. et al. Artificial intelligence-assisted polyp detection for colonoscopy: Initial experience. Gastroenterology 2018; 154: 2027-2029.e3
  • 2 Byrne M. Artificial intelligence in gastroenterology. Techniques Innovations Gastrointest Endosc 2019; 22: 41-90
  • 3 Lui TKL, Guo C-G, Leung WK. Accuracy of artificial intelligence on histology prediction and detection of colorectal polyps: a systematic review and meta-analysis. Gastrointest Endosc 2020; 92: 11-22.e6
  • 4 Wang P, Berzin TM, Glissen Brown JR. et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut 2019; 68: 1813-1819
  • 5 Wang P, Liu X, Berzin TM. et al. Effect of a deep-learning computer-aided detection system on adenoma detection during colonoscopy (CADe-DB trial): a double-blind randomised study. Lancet Gastroenterol Hepatol 2020; 5: 343-351
  • 6 Hassan C, Spadaccini M, Iannone A. et al. Performance of artificial intelligence for colonoscopy regarding adenoma and polyp detection: a meta-analysis. Gastrointest Endosc 2021; 93: 77-85
  • 7 Zhao S, Wang S, Pan P. et al. Magnitude, risk factors, and factors associated with adenoma miss rate of tandem colonoscopy: a systematic review and meta-analysis. Gastroenterology 2019; 156: 1661-1674.e11
  • 8 Kaminski MF, Thomas-Gibson S, Bugajski M. et al. Performance measures for lower gastrointestinal endoscopy: a European Society of Gastrointestinal Endoscopy (ESGE) Quality Improvement Initiative. Endoscopy 2017; 49: 378-397
  • 9 Rex D, Cutler C, Lemmel G. et al. Colonoscopic miss rates of adenomas determined by back-to-back colonoscopies. Gastroenterology 1997; 112: 24-28
  • 10 Pimenta-Melo AR, Monteiro-Soares M, Libânio D. et al. Missing rate for gastric cancer during upper gastrointestinal endoscopy: a systematic review and meta-analysis. Eur J Gastroenterol Hepatol 2016; 28: 1041-1049
  • 11 Rodríguez de Santiago E, Hernanz N, Marcos-Prieto HM. et al. Rate of missed oesophageal cancer at routine endoscopy and survival outcomes: A multicentric cohort study. United Eur Gastroenterol J 2019; 7: 189-198
  • 12 Liu W-N, Zhang Y-Y, Bian X-Q. et al. Study on detection rate of polyps and adenomas in artificial-intelligence-aided colonoscopy. Saudi J Gastroenterol 2020; 26: 13
  • 13 Barua I, Vinsard D, Jodal H. et al. Artificial intelligence for polyp detection during colonoscopy: a systematic review and meta-analysis. Endoscopy 2020; 53: 277-284
  • 14 Repici A, Badalamenti M, Maselli R. et al. Efficacy of real-time computer-aided detection of colorectal neoplasia in a randomized trial. Gastroenterology 2020; 159: 512-520