Open Access
CC BY 4.0 · Endosc Int Open 2025; 13: a26952841
DOI: 10.1055/a-2695-2841
Letter to the editor

Urgency for standardized protocols to improve clinical implementation of artificial intelligence in endoscopic diagnostics

Authors

  • Ulrik Deding

    1   Department of Clinical Research, University of Southern Denmark, Odense, Denmark (Ringgold ID: RIN6174)
    2   Department of Surgery, Odense University Hospital, Svendborg, Denmark (Ringgold ID: RIN11286)
  • Benedicte Schelde-Olesen

    3   Surgical Research Unit, Odense University Hospital, Odense, Denmark (Ringgold ID: RIN11286)
  • Ervin Toth

    4   Endoscopy Unit, Department of Gastroenterology, Skane University Hospital, Malmo, Sweden
  • Anastasios Koulaouzidis

    5   Department of Social Medicine & Public Health, Pomeranian Medical University in Szczecin, Szczecin, Poland (Ringgold ID: RIN37805)
  • on behalf of the AICE consortium

Funded by the European Union. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

The recent systematic review by Cold et al. [1] offers a comprehensive overview of artificial intelligence (AI) tools designed for bowel preparation scoring in colonoscopy. Despite confirming high performance, the review highlights significant limitations, including inconsistent validation methods, a lack of external reproducibility, minimal integration into clinical workflows, and suboptimal reference standards. Most importantly, correlation with clinically meaningful endpoints such as adenoma detection rate (ADR) or adenoma miss rate (AMR) is scarce, which leaves the practical impact on patient outcomes uncertain. This is particularly problematic because the benchmark itself (colonoscopy) has a mean AMR of 26%, which calls into question its suitability as a reference standard for validating AI systems, especially when those systems are applied to other modalities such as capsule endoscopy (CE), where the diagnostic context is fundamentally different.

Although interobserver agreement using the Boston Bowel Preparation Scale (BBPS) may be high, it holds little clinical value if not linked to hard outcomes. Moreover, training AI only on high-consensus images streamlines annotation but biases models toward clear-cut cases, reducing performance on ambiguous or diagnostically challenging images. Clinical-grade AI must be trained on both ambiguous and consensus cases to ensure balanced diagnostic capability.
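The point about consensus-only training can be illustrated with a minimal sketch. It assumes a hypothetical annotation table in which three readers score each image on the 0 to 3 BBPS segment scale; the image identifiers, scores, and function names are invented for illustration and are not taken from any of the cited studies. Instead of discarding images on which readers disagree, the sketch keeps every image and stores the distribution of reader scores as a soft label.

```python
from collections import Counter

# Hypothetical reader annotations: BBPS segment scores (0-3) from three readers.
READER_SCORES = {
    "img_001": [3, 3, 3],  # full consensus
    "img_002": [2, 3, 2],  # partial agreement
    "img_003": [0, 1, 2],  # ambiguous image
}


def soft_label(scores, n_classes=4):
    """Return the fraction of readers who chose each score 0..n_classes-1."""
    counts = Counter(scores)
    return [counts.get(k, 0) / len(scores) for k in range(n_classes)]


def agreement(scores):
    """Fraction of readers who chose the most common score."""
    return Counter(scores).most_common(1)[0][1] / len(scores)


def build_training_set(reader_scores, consensus_only=False):
    """Build (image_id, soft_label, agreement) rows.

    With consensus_only=True, images without unanimous scores are dropped,
    which mimics the high-consensus filtering criticized in the text.
    """
    rows = []
    for image_id, scores in sorted(reader_scores.items()):
        level = agreement(scores)
        if consensus_only and level < 1.0:
            continue
        rows.append((image_id, soft_label(scores), level))
    return rows


filtered = build_training_set(READER_SCORES, consensus_only=True)
full = build_training_set(READER_SCORES)
print(len(filtered), "of", len(full), "images survive consensus filtering")
```

In this toy table only one of three images survives consensus filtering, so a model trained on the filtered set would never see the disagreement cases it is most likely to meet in practice.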

Nadimi et al. [2] address these gaps with an explainable AI (XAI) model integrated into the colon CE (CCE) workflow, automating polyp detection, characterization, and sizing. Using methods such as GradCAM++, CartoonX, and Pixel RDE, the system combines technical strength with transparency to support clinical trust and adoption. Their work highlights the value of combining AI tools to achieve more precise and detailed results, in line with the discussion by Cold et al. In parallel, Mascarenhas-Saraiva et al. [3] developed a deep learning-based AI for bowel preparation assessment in CCE, prioritizing interpretability and real-time feedback. Designed for clinical use, it aims to enhance trust and reduce fatigue or variation in high-volume endoscopy settings.

Most current AI models in capsule endoscopy ignore key clinical cues such as frame and pixel clustering and the time spent on specific regions. This lack of temporal modeling limits diagnostic realism and disconnects AI behavior from clinician reasoning. Although temporal modeling is less critical in live colonoscopy, it is essential in asynchronous CCE. In addition, Moen et al. [4] outlined critical challenges and future directions for AI in CCE. Their review emphasizes the need for harmonized datasets, transparent AI models, and coordinated clinical validation. These systemic challenges support the rationale for a unified framework such as the Artificial Intelligence in Capsule Endoscopy (AICE) project to ensure alignment between technological development and clinical implementation.
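As a rough illustration of what frame clustering and time on region could mean in practice, the sketch below groups per-frame detections from a capsule video into events and reports how long each event stays in view. Everything here is an assumption made for illustration: the function names, the gap tolerance, and the fixed frame rate (capsule frame rates are typically adaptive, so a real implementation would use per-frame timestamps). It is not the method of any cited study.

```python
from dataclasses import dataclass


@dataclass
class Event:
    first_frame: int
    last_frame: int
    n_frames: int      # number of frames with a detection
    duration_s: float  # time the finding stays in view


def _to_event(run, fps):
    """Summarize one run of nearby detection frames as an Event."""
    span = run[-1] - run[0] + 1
    return Event(run[0], run[-1], len(run), span / fps)


def cluster_detections(frame_indices, max_gap=5, fps=4.0):
    """Group detection frames into events.

    Frames separated by at most max_gap frames are treated as one event.
    max_gap and fps are illustrative assumptions, not validated settings.
    """
    events = []
    run = []
    for idx in sorted(frame_indices):
        if run and idx - run[-1] > max_gap:
            events.append(_to_event(run, fps))
            run = []
        run.append(idx)
    if run:
        events.append(_to_event(run, fps))
    return events


# Hypothetical frame indices at which a detector flagged a polyp.
detections = [100, 101, 103, 250, 251, 252, 253]
events = cluster_detections(detections)
for event in events:
    print(event)
```

Reporting events with a duration, as opposed to isolated frame-level hits, is closer to how a reader reviews a CCE video and gives a natural unit for per-lesion validation.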

AICE was funded by the EU to address precisely these issues. AICE is a European collaboration committed to building a cohesive and clinically driven framework for AI implementation in CCE. It aims to standardize evaluation protocols, support multicenter validation efforts, and advocate for alignment with established international standards, including those of the European Society of Gastrointestinal Endoscopy and the World Endoscopy Organization. By fostering collaboration between clinicians, data scientists, and regulatory stakeholders, AICE seeks to transform AI from a set of isolated tools into an integrated component of endoscopic care. AICE also prioritizes explainability and usability as core pillars of responsible AI. The initiative supports the use of interpretable models, transparent reporting guidelines, and validation in real-world settings. This approach is intended to ensure that end-users, especially clinicians, can trust and effectively interact with AI outputs. It also lays the foundation for regulatory approval and integration into national screening programs, particularly as health systems become more digitally mature.

Despite notable individual advances, however, the broader field of AI in endoscopy remains fragmented. There is still limited consensus on validation protocols, outcome measures, reference standards, and implementation standards [5]. The momentum of innovation appears to have outpaced the frameworks for adoption. What is needed now is alignment on reporting, external validation across populations, and endpoints that reflect real patient outcomes, including diagnostic uncertainty, not just retrospective accuracy.



Publication History

Received: 01 August 2025

Accepted: 02 September 2025

Article published online: 24 September 2025

© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/).

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany

Bibliographical Record
Ulrik Deding, Benedicte Schelde-Olesen, Ervin Toth, Anastasios Koulaouzidis. Urgency for standardized protocols to improve clinical implementation of artificial intelligence in endoscopic diagnostics. Endosc Int Open 2025; 13: a26952841.
DOI: 10.1055/a-2695-2841
 
References

  • 1 Cold KM, Ali A, Konge L. et al. Bowel preparation assessment using artificial intelligence: Systematic review. Endosc Int Open 2025; 13: a26256327
  • 2 Nadimi ES, Braun JM, Schelde-Olesen B. et al. Towards full integration of explainable artificial intelligence in colon capsule endoscopy's pathway. Sci Rep 2025; 15: 5960
  • 3 Mascarenhas Saraiva MJ, Afonso J, Ribeiro T. et al. AI-driven colon cleansing evaluation in capsule endoscopy: A deep learning approach. Diagnostics (Basel) 2023; 13: 3494
  • 4 Moen S, Vuik FER, Kuipers EJ. et al. Artificial intelligence in colon capsule endoscopy: A systematic review. Diagnostics (Basel) 2022; 12: 1994
  • 5 Lei II, Koulaouzidis A, Schelde-Olesen B. et al. Unifying terminology, reporting, and bowel preparation standards in colon capsule endoscopy: Nyborg Consensus. Endosc Int Open 2025; 13: a24955427