Endoscopy 2023; 55(04): 342-343
DOI: 10.1055/a-1986-7532
Editorial

Artificial intelligence in endoscopic assessment of ulcerative colitis: virtual painting with PICaSSO

Referring to Iacucci M et al. p. 332–341
Silvio Danese
1   Gastroenterology and Endoscopy Department, IRCCS Ospedale San Raffaele, Milan, Italy
2   University Vita-Salute San Raffaele, Milan, Italy

Interobserver variability limits the precise assessment of endoscopic activity in patients with inflammatory bowel disease (IBD). Therefore, there is increasing interest in computer models to overcome the variability between endoscopists. In the current issue of Endoscopy, Iacucci and the PICaSSO group present the “new kid on the block” of artificial intelligence (AI) systems for the evaluation of inflammatory activity in ulcerative colitis (UC) [1].

The authors developed two separate computer models, one based on white-light endoscopy (WLE) and one on virtual chromoendoscopy (VCE), and then compared their performance against the human “gold standard”. The computer system assessed full endoscopy videos rather than individual frames, consolidating a shift towards practical clinical use that has already been seen in other studies [2]. The dataset used to train the algorithms was large and, more importantly, was collected in multiple centers. This heterogeneity is healthy and reassuring for the applicability of the model in different settings; in other words, it limits the risk of model overfitting.

The main novelty of the study is the use of virtual chromoendoscopy, particularly Pentax’s platform. Building on the PICaSSO multicenter study, which showed that VCE improved the assessment of inflammation, the authors developed the two AIs (WLE-AI and VCE-AI) and tested their ability to distinguish endoscopic remission from activity. The VCE-AI showed higher diagnostic performance, with sensitivity in excess of 80 % and specificity over 90 %, while the performance of the WLE-AI was slightly lower; however, because the definitions and cutoffs of endoscopic activity/remission differed for WLE (UCEIS > 1) and VCE (PICaSSO > 3), a direct comparison of the models is not possible. When the analysis was restricted to high quality videos, performance improved modestly, mainly owing to increased sensitivity; this supports the overall stability of the model, with little interference from lower quality data.
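To make the dichotomized evaluation concrete, the following minimal Python sketch shows how a score cutoff turns an ordinal endoscopic score into an activity/remission label, and how sensitivity and specificity follow from comparing two sets of labels. It is an illustration only: the function names, the scores, and the AI calls are hypothetical and are not taken from the study, and the sketch assumes that a score above the cutoff denotes activity.

```python
from typing import List, Sequence, Tuple


def dichotomize(scores: Sequence[int], cutoff: int) -> List[bool]:
    """Label each score as active (True) when it exceeds the cutoff."""
    return [score > cutoff for score in scores]


def sensitivity_specificity(
    predicted: Sequence[bool], reference: Sequence[bool]
) -> Tuple[float, float]:
    """Compare predicted labels with reference labels (True = active).

    Assumes the reference contains at least one active and one
    remission case, otherwise a division by zero occurs.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical endoscopist UCEIS scores, for illustration only (not study data).
human_uceis = [0, 1, 2, 3, 0, 4, 1, 5]
# Assumed reading of the WLE cutoff: UCEIS > 1 denotes endoscopic activity.
reference_active = dichotomize(human_uceis, cutoff=1)

# Hypothetical AI calls for the same eight videos (True = active).
ai_active = [False, False, True, False, False, True, True, True]

sens, spec = sensitivity_specificity(ai_active, reference_active)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because the reference label itself depends on the score and cutoff chosen (UCEIS > 1 versus PICaSSO > 3), the two models are judged against different reference standards, which is why their sensitivities and specificities cannot be compared directly.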

As a secondary outcome, the study explored the ability of the models to predict the underlying histological activity. The two systems showed diagnostic accuracies of around 80 %, with minimal differences depending on the score used. Again, the study was not designed to compare the two AI systems (WLE and VCE), but the use of the same histological cutoffs allows for some speculation. Because the two AIs had similar performances, the authors hypothesize that the advantage of VCE, and therefore of VCE-AI, may be a peculiarity of the human eye and brain, and is therefore bypassed by the computers. This fascinating hypothesis will need confirmation in future studies, but it suggests that AI could overcome at least some of the limitations of human assessment.

Finally, having followed the participants for 12 months, the authors evaluated the prognostic ability of the AI systems to predict flares. They classified patients as being in endoscopic remission or activity, using WLE and VCE, based on human and AI assessments, and plotted the respective Kaplan–Meier curves for the risk of flare. The curves show clear separation, supporting the importance of endoscopic inflammation as a predictor of flare. The hazard ratios were comparable within each pair (AI and endoscopist), although the difference was not statistically tested, again supporting the validity of the system for assessing the risk of flare.
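For readers less familiar with the method, the Kaplan–Meier estimate behind such curves can be sketched in a few lines of Python. This is a minimal, generic product-limit estimator under stated assumptions, not the authors' analysis: the function name and the follow-up data are hypothetical, with time in months, an event value of 1 for a flare, and 0 for a patient censored without a flare.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, flare-free probability) at each time with a flare.

    events[i] is 1 if patient i flared at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        flares = 0   # flares observed exactly at time t
        leaving = 0  # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            flares += data[i][1]
            leaving += 1
            i += 1
        if flares > 0:
            # Product-limit step: multiply by the fraction still flare-free.
            survival *= 1 - flares / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up of five patients, for illustration only.
months = [3, 6, 6, 12, 12]
flared = [1, 1, 0, 0, 0]

curve = kaplan_meier(months, flared)
for t, s in curve:
    print(f"month {t}: flare-free probability {s:.2f}")
```

Running the estimator separately for the patients classified as being in remission and those classified as active, once using the endoscopist's labels and once using the AI's, yields the pairs of curves whose separation the authors describe.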

Despite the important progress and exciting performance, such models, including others previously reported, fall short of grading the full spectrum of UC severity, ranging from remission to severe inflammation. Iacucci and colleagues importantly focused on differentiating mild disease from remission, arguably the most challenging clinical assessment, with important clinical implications. Widespread adoption will however likely require progression from a dichotomous classification to the further differentiation of all grades of inflammation. Beyond everyday clinical practice, models like these could soon replace the inefficient process of central reading for clinical trials, by providing a quick and inexpensive arbitrator.

Interest in AI for endoscopy in IBD is well placed, and exciting technological novelties are about to arrive. Nevertheless, thus far, studies have focused on UC, while Crohn’s disease (CD) has remained neglected. We hope that the difficulties in evaluating CD activity will soon be overcome. Endoscopy in CD remains challenging, with a major need for standardization that should be prioritized.

Overall, the work by Iacucci and colleagues consolidates the experience of AI-based systems in IBD endoscopy and extends it to VCE. Promising results in disease characterization point to a bright future for a deeper understanding of UC, along with better outcome prediction.



Publication History

Article published online:
11 January 2023

© 2023. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 
References

1 Iacucci M, Cannatelli R, Parigi TL et al. A virtual chromoendoscopy artificial intelligence system to detect endoscopic and histologic activity/remission and predict clinical outcomes in ulcerative colitis. Endoscopy 2023; 55: 332-341. DOI: 10.1055/a-1960-3645
2 Takenaka K, Fujii T, Kawamoto A et al. Deep neural network for video colonoscopy of ulcerative colitis: a cross-sectional study. Lancet Gastroenterol Hepatol 2022; 7: 230-237