DOI: 10.1055/a-2650-0789
Multicenter validation of a cholangioscopy artificial intelligence system for the evaluation of biliary tract disease
Supported by: UMass Memorial Health
Supported by: University of Massachusetts Medical School
Supported by: MassVentures
Supported by: Mayo Clinic

Abstract
Background
Clinicians struggle to accurately classify biliary strictures as benign or malignant. Current endoscopic retrograde cholangiopancreatography (ERCP)-based sampling modalities including brush cytology and forceps biopsy have poor sensitivity for pathologic confirmation of malignancy. Cholangioscopy allows for direct visualization and sampling of biliary pathology; however, this technology is also associated with inaccurate classification of biliary disease. Previously, an artificial intelligence (AI) system that analyzes cholangioscopy footage was found to be more accurate in diagnosing biliary malignancy than ERCP sampling techniques. The aim of this study was to validate this AI system on a new series of examinations.
Method
Three academic centers collected all available unedited cholangioscopy recordings. The videos were processed by the cholangioscopy AI system. After analyzing videos, the AI system provided predictions as to whether malignancy was present. AI performance in classifying strictures was compared with the performance of brush cytology and forceps biopsy.
Results
A total of 112 cholangioscopy examinations (containing 4 817 081 images) were collected from 99 patients. Of those examinations, 61 (54.5%) were performed to investigate biliary strictures (31 [50.8%] benign, 30 [49.2%] malignant). For stricture classification, the AI system had a sensitivity of 80.0% and a specificity of 90.3%. It was also significantly more accurate for stricture classification (85.2%) than brush cytology (52.5%; P<0.001), forceps biopsy (68.2%; P=0.04), and the combination of brush cytology and forceps biopsy (66.7%; P=0.02).
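The reported sensitivity, specificity, and accuracy can be checked against the stricture counts with a short calculation. The sketch below is illustrative only: the abstract gives 30 malignant and 31 benign strictures, but the individual cell counts (24 true positives, 28 true negatives) are an assumption, back-calculated from the reported percentages rather than taken from the source, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # malignant strictures called malignant
    specificity = tn / (tn + fp)            # benign strictures called benign
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): back-calculated so that
# 30 malignant and 31 benign strictures reproduce the reported percentages.
sens, spec, acc = diagnostic_metrics(tp=24, fn=6, tn=28, fp=3)

print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts, 24/30, 28/31, and 52/61 round to the 80.0%, 90.3%, and 85.2% given in the Results.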
Conclusion
In a multicenter validation cohort, a previously developed cholangioscopy AI system continued to outperform standard ERCP sampling modalities in identifying malignancy, without additional retraining.
Publication History
Received: 18 February 2025
Accepted after revision: 06 July 2025
Accepted Manuscript online: 06 July 2025
Article published online: 18 August 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany