Endoscopy 2025; 57(08): 947
DOI: 10.1055/a-2595-0174
Letter to the editor

Reflections on artificial intelligence for submucosal vessel detection during third-space endoscopy

1   Department of Artificial Intelligence, Asian Institute of Gastroenterology, Hyderabad, India (Ringgold ID: RIN78470)
2   Department of Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, India (Ringgold ID: RIN78470)

Rakesh Kalpala (2), Mohan Ramchandani (2), D. Nageshwar Reddy (2)

We read with great interest the article by Scheppach et al., “Use of artificial intelligence for submucosal vessel detection during third-space endoscopy” [1]. The authors present valuable work exploring the potential for artificial intelligence (AI) to enhance safety during complex endoscopic procedures. We wish to offer some reflections on the study’s methodology and its implications for clinical translation.

First, the composition of the test dataset (vessels present in 90% of clips, with a mean clip duration of 29.5 seconds) differs substantially from the variability of clinical practice. Such a curated setting may induce expectation bias, because participants can anticipate a vessel in most clips, and it does not replicate the cognitive load and divided attention that endoscopists face during lengthy procedures.

Second, the sequential testing design, in which clips without AI support were evaluated before AI-supported clips, introduces a potential order effect. Participants, particularly trainees with limited prior exposure (observation of ~10 procedures), may have improved through learning during the test itself, which could inflate the apparent benefit of AI. An alternating or randomized presentation order would mitigate this bias.
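As an illustration of the randomized presentation suggested above, the following is a minimal sketch, not the original study's protocol. The function name `assign_conditions`, the condition labels, and the clip identifiers are hypothetical. Half of the clips are assigned to AI support at random, and the viewing order is then shuffled so that neither condition is systematically seen first.

```python
import random

def assign_conditions(clip_ids, seed=0):
    """Randomly assign half of the clips to AI support, then shuffle the order.

    Hypothetical helper: names and labels are illustrative assumptions.
    """
    rng = random.Random(seed)           # seeded for a reproducible schedule
    clips = list(clip_ids)
    rng.shuffle(clips)                  # random allocation to conditions
    half = len(clips) // 2
    schedule = [(c, "AI") for c in clips[:half]]
    schedule += [(c, "no_AI") for c in clips[half:]]
    rng.shuffle(schedule)               # interleave the two conditions
    return schedule

# Hypothetical example with 20 clips
schedule = assign_conditions(range(1, 21), seed=42)
```

Using a different seed per participant would additionally vary which clips each reader sees with AI support, so that clip difficulty is not confounded with condition.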

Third, while the reported mean intersection over union (IoU) of ~64%–67% indicates substantial but incomplete overlap between predicted and annotated vessel areas, its clinical adequacy requires careful consideration. Incomplete vessel delineation, particularly at bifurcations, could lead to inadvertent transection even when the AI output counts as a “successful” detection under the IoU threshold.
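The concern about bifurcations can be made concrete with a toy example. The sketch below is illustrative only: the pixel coordinates are invented, and the detection threshold of 0.5 is an assumption made here, not a value taken from the study. A Y-shaped vessel is annotated as a trunk with two branches, while the prediction covers the trunk and only one branch.

```python
def iou(pred, truth):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 0.0

# Toy Y-shaped vessel on a pixel grid (coordinates are illustrative)
trunk = {(r, 5) for r in range(10)}                 # 10 pixels
left_branch = {(10 + i, 4 - i) for i in range(5)}   # 5 pixels
right_branch = {(10 + i, 6 + i) for i in range(5)}  # 5 pixels

truth = trunk | left_branch | right_branch          # annotated vessel
pred = trunk | left_branch                          # one branch is missed

ASSUMED_THRESHOLD = 0.5   # hypothetical cut-off, not from the study
score = iou(pred, truth)  # 15 / 20 = 0.75
detected = score >= ASSUMED_THRESHOLD
missed = truth - pred     # pixels of the undetected branch
```

In this example the prediction scores an IoU of 0.75 and would count as a detection under the assumed threshold, yet an entire branch remains unmarked.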

Finally, the modest labeled training dataset (5470 images), together with its imbalance relative to the 179 681 unlabeled images, may limit model robustness and generalizability. For safety-critical clinical AI, discussion of appropriate significance thresholds and dataset requirements is crucial. The reliance on retrospective video clips, while practical for initial algorithm development, limits the assessment of AI performance under real-world clinical conditions such as operator fatigue or variable procedural complexity. These factors are better captured in prospective randomized controlled trials.
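For scale, the imbalance between labeled and unlabeled images can be expressed as a simple proportion. The short calculation below uses only the two counts quoted above; the variable names are illustrative.

```python
labeled = 5470      # labeled training images
unlabeled = 179681  # unlabeled images

total = labeled + unlabeled                    # 185151 images overall
labeled_fraction = labeled / total             # share of images with labels
unlabeled_per_labeled = unlabeled / labeled    # unlabeled images per labeled one

print(f"Labeled share: {labeled_fraction:.2%}")
print(f"Unlabeled per labeled image: {unlabeled_per_labeled:.1f}")
```

On these figures, roughly 3% of the images carry labels, or about 33 unlabeled images for every labeled one.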

We commend the authors’ contribution to this important field and believe addressing these points in prospective trials will be vital for validating this technology for clinical use.



Publication History

Article published online:
29 July 2025

© 2025. Thieme. All rights reserved.

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany