GastroNet-5M: A Multicenter Dataset for Foundation Model Development in Gastrointestinal Endoscopy

M Jong; T Boers; K Fockens; J Jelmer; C Kusters; T Jaspers; R van Eijck van Heslinga; M Struyvenberg; R Bisschops; J Van Der Putten; PH N De With; F Van Der Sommen; J De Groof; J Bergman

doi:10.1055/s-0045-1805427

Endoscopy 2025; 57(S 02): S169-S170
DOI: 10.1055/s-0045-1805427

Abstracts | ESGE Days 2025

Oral presentation

Innovative techniques and devices in endoscopy 05/04/2025, 12:00 – 13:00 Room 118+119

M Jong¹, T Boers², K Fockens¹, J Jelmer¹, C Kusters², T Jaspers², R van Eijck van Heslinga¹, M Struyvenberg¹, R Bisschops³, J Van Der Putten², PH N De With², F Van Der Sommen², J De Groof¹, J Bergman¹

¹Amsterdam UMC, locatie VUmc, Amsterdam, Netherlands
²Eindhoven University of Technology, Eindhoven, Netherlands
³UZ Leuven, Leuven, Belgium


Aims Developing deep learning systems for medical imaging typically demands extensive datasets. Yet large-scale collection of data and corresponding expert annotations remains challenging and costly. Foundation models, which are large pretrained models that capture broad, transferable knowledge from domain-specific data (e.g. endoscopic imagery), have shown promise in addressing these data limitations. However, the field of endoscopy still lacks accessible datasets suitable for training such models. In this study, we describe new experiments with GastroNet-5M, a comprehensive dataset of 5,002,545 endoscopic images, and further explore its potential to support foundation model development in endoscopy.

Methods GastroNet-5M is composed of anonymized endoscopic images collected from eight Dutch hospitals between 2012 and 2020. Using a self-supervised learning approach, this dataset enabled the development of a foundation model for various endoscopic AI applications. In this study, we compared GastroNet-5M-pretrained models against the current standard of ImageNet-pretrained models. First, diagnostic performance was compared across 11 endoscopic classification and segmentation tasks, such as Barrett’s neoplasia detection, colorectal polyp characterization, and gastric cancer invasion depth prediction. Second, data efficiency was assessed by repeating these experiments with stepwise reductions in training set size to examine model performance with less data. Finally, robustness was evaluated by testing model performance against data heterogeneity (e.g. training and testing on different endoscope manufacturers) on 4 additional test datasets.
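The data-efficiency step can be illustrated with a minimal Python sketch. The abstract does not report which reduction steps were used or how the subsets were drawn, so the function name `stepwise_subsets`, the fractions, and the nested-subset design are all assumptions made for illustration only.

```python
import random


def stepwise_subsets(samples, fractions=(1.0, 0.5, 0.25, 0.1), seed=0):
    """Return {fraction: subset} for a stepwise-reduced training set.

    Assumption (not from the abstract): subsets are nested, i.e. every
    smaller subset is contained in the larger ones, so that differences
    between steps reflect the amount of data rather than which images
    happened to be drawn.
    """
    shuffled = list(samples)
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)

    subsets = {}
    for frac in fractions:
        if not 0 < frac <= 1:
            raise ValueError(f"fraction must be in (0, 1], got {frac}")
        # Keep at least one sample so a model can still be trained.
        n = max(1, round(frac * len(shuffled)))
        # Taking a prefix of the same shuffled list makes subsets nested.
        subsets[frac] = shuffled[:n]
    return subsets


if __name__ == "__main__":
    # Hypothetical image identifiers standing in for a downstream task.
    image_ids = [f"img_{i:04d}" for i in range(100)]
    for frac, subset in stepwise_subsets(image_ids).items():
        print(f"{frac:>5.0%} of training data -> {len(subset)} images")
```

Under this design, each pretrained model would be fine-tuned once per subset and evaluated on the same fixed test set, which yields one performance-versus-data curve per pretraining strategy.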

Results Models pretrained with GastroNet-5M demonstrated a significant performance increase, surpassing all ImageNet benchmark models across all endoscopic downstream tasks (p<0.001). On average, GastroNet-5M models displayed a 3.5% higher AUC score for classification tasks and an 11.5% higher Dice score for segmentation tasks compared to the ImageNet models. In addition, GastroNet-5M models required significantly less downstream training data for 10 out of 11 downstream tasks (p=0.10). Finally, GastroNet-5M models displayed higher classification scores across all 4 robustness test datasets.
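For readers less familiar with the two metrics reported here, the following Python sketch gives their standard textbook definitions. It is not the authors' evaluation code, which the abstract does not describe; the function names and the toy inputs are hypothetical.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    pred = [bool(p) for p in pred_mask]
    true = [bool(t) for t in true_mask]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    # Convention chosen here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def auc_score(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy segmentation: 4 pixels, the prediction over-segments by one pixel.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
    # Toy classification: one negative case outranks one positive case.
    print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC loop is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for large test sets.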

Conclusions GastroNet-5M, a multicenter dataset of over 5 million unlabeled endoscopic images, offers a valuable resource for pretraining deep learning models in endoscopy. The use of GastroNet-5M enhances model accuracy, reduces required dataset size, and improves robustness of endoscopic AI systems. GastroNet-5M will be made publicly accessible for further research and development.

Publication History

Article published online:
27 March 2025

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany