Endoscopy 2024; 56(S 02): S79
DOI: 10.1055/s-0044-1782860
Abstracts | ESGE Days 2024
Oral presentation
New Frontiers in Barrett's esophagus surveillance, 26/04/2024, 11:30 – 12:30, Room 8

Image quality pitfalls in AI: Safeguarding Barrett's neoplasia detection with robust deep learning training strategies

M. Jong (1), T. Jaspers (2), C. Kusters (2), J. Jukema (1), K. Fockens (1), R. van Eijck van Heslinga (3), T. Boers (2), F. Van Der Sommen (2), P. De With (2), J. De Groof (1), J. Bergman (3)

1   Amsterdam UMC, locatie VUmc, Amsterdam, Netherlands
2   Eindhoven University of Technology, Eindhoven, Netherlands
3   VU University Medical Center, Amsterdam, Netherlands

    Aims Endoscopic artificial intelligence systems developed in expert centers with high-quality imaging may underperform in community hospitals, where image quality is more heterogeneous. This study aimed to quantify the performance degradation of a computer-aided detection (CADe) system for Barrett’s neoplasia when exposed to the heterogeneous imaging conditions of community hospitals. Subsequently, different state-of-the-art training strategies were evaluated to mitigate this performance loss.

    Methods We developed a CADe system using a high-quality, expert-acquired training set comprising 437 images from 173 patients with Barrett’s neoplasia and 574 images from 200 patients with non-dysplastic Barrett’s esophagus. We assessed its performance on high-, moderate-, and low-quality test sets, each containing 120 images derived from the same group of 65 patients with Barrett’s neoplasia and 55 patients with non-dysplastic Barrett’s esophagus. These test sets were completely independent of the training set and simulated the heterogeneous image quality of community hospitals. We then applied four robustness-enhancing strategies: diversified training data, domain-specific pretraining, targeted data augmentation, and architectural optimization.
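
The abstract does not state which augmentations were used, so the following is only a minimal sketch of what targeted data augmentation for image quality could look like. It assumes that lower image quality can be approximated by blur, sensor noise, and reduced contrast. The function names (`box_blur`, `degrade`) and all parameter ranges are hypothetical and are not taken from the study.

```python
import numpy as np


def box_blur(image: np.ndarray, radius: int) -> np.ndarray:
    """Blur an H x W x C float image with a (2*radius+1) x (2*radius+1) box filter."""
    if radius == 0:
        return image.astype(np.float64)
    k = 2 * radius + 1
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(image, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape[:2]
    # Sum every shifted copy of the image inside the filter window.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (k * k)


def degrade(image: np.ndarray, rng: np.random.Generator,
            max_blur_radius: int = 3, max_noise_std: float = 0.05,
            min_contrast: float = 0.6) -> np.ndarray:
    """Randomly degrade an image with values in [0, 1].

    Assumed degradations (illustrative only): blur, reduced contrast,
    additive Gaussian noise.
    """
    radius = int(rng.integers(0, max_blur_radius + 1))
    noise_std = rng.uniform(0.0, max_noise_std)
    contrast = rng.uniform(min_contrast, 1.0)

    out = box_blur(image, radius)
    mean = out.mean()
    out = (out - mean) * contrast + mean      # pull pixel values toward the mean
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)             # keep values in the valid range
```

In a training loop, a function like `degrade` would be applied to each high-quality training image with a fresh random draw, so that the network also sees degraded versions of the expert-acquired images.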

    Results The CADe system, when trained exclusively on high-quality data, achieved an AUC of 82% on the high-quality test set. AUCs were significantly lower on the moderate-quality (79%; p<0.001) and low-quality (70%; p<0.001) test sets. Incorporating robustness-enhancing strategies significantly improved the AUC to 93% for the high-quality (p=0.020), 94% for the moderate-quality (p=0.006), and 84% for the low-quality test sets (p=0.002). These strategies also significantly reduced the performance drop relative to the high-quality test set on the moderate-quality (+1% vs -3%; p<0.001) and low-quality test sets (-9% vs -12%; p=0.004).
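
The reported performance drops can be reproduced from the AUC values by subtracting the high-quality AUC from the AUC on each lower-quality test set. The sketch below shows that arithmetic together with a plain rank-based AUC computation. The helper names (`auc`, `drop_vs_high`) are hypothetical, and the abstract does not describe how the AUCs or p-values were actually computed.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def drop_vs_high(auc_by_quality):
    """Difference in AUC (percentage points) relative to the high-quality test set."""
    high = auc_by_quality["high"]
    return {q: v - high for q, v in auc_by_quality.items() if q != "high"}


# AUC values (in %) as reported in the Results.
baseline = {"high": 82, "moderate": 79, "low": 70}
robust = {"high": 93, "moderate": 94, "low": 84}

print(drop_vs_high(baseline))  # {'moderate': -3, 'low': -12}
print(drop_vs_high(robust))    # {'moderate': 1, 'low': -9}
```

The differences are expressed in percentage points of AUC, which matches the +1 vs -3 and -9 vs -12 figures in the Results.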

    Conclusions CADe systems that are trained solely on high-quality images may not perform well on the variable image quality found in routine clinical practice. However, this study shows that state-of-the-art robustness-enhancing strategies can significantly improve both their robustness and their absolute performance, increasing the likelihood of successful implementation of artificial intelligence systems in clinical practice.


    Conflicts of interest

    Authors do not have any conflict of interest to disclose.

    Publication History

    Article published online:
    15 April 2024

    © 2024. European Society of Gastrointestinal Endoscopy. All rights reserved.

    Georg Thieme Verlag KG
    Rüdigerstraße 14, 70469 Stuttgart, Germany