Conventional small-bowel capsule endoscopy reading vs proprietary artificial intelligence auxiliary systems: Systematic review and meta-analysis

Pablo Cortegoso Valdivia; Stefano Fantasia; Stefano Kayali; Ulrik Deding; Noemi Gualandi; Mauro Manno; Ervin Toth; Xavier Dray; Shiming Yang; Anastasios Koulaouzidis

doi:10.1055/a-2544-2863

CC BY 4.0 · Endosc Int Open 2025; 13: a25442863
Review

Pablo Cortegoso Valdivia¹; Stefano Fantasia¹; Stefano Kayali¹; Ulrik Deding²,³; Noemi Gualandi⁴; Mauro Manno⁴; Ervin Toth⁵; Xavier Dray⁶,⁷; Shiming Yang⁸; Anastasios Koulaouzidis²,⁹,¹⁰,¹¹

¹Gastroenterology and Endoscopy Unit, University Hospital of Parma, Parma, Italy (Ringgold ID: RIN18630)
²Department of Clinical Research, University of Southern Denmark, Odense, Denmark (Ringgold ID: RIN6174)
³Department of Surgery, Odense University Hospital, Svendborg, Denmark (Ringgold ID: RIN11286)
⁴Gastroenterology and Digestive Endoscopy Unit, AUSL Modena, Carpi, Italy (Ringgold ID: RIN18067)
⁵Department of Gastroenterology, Skåne University Hospital, Lund University, Malmoe, Sweden
⁶Center for Digestive Endoscopy, Saint Antoine Hospital, APHP, Sorbonne Université, Paris, France (Ringgold ID: RIN27063)
⁷Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark (Ringgold ID: RIN6174)
⁸Department of Gastroenterology, The Second Affiliated Hospital, Third Military Medical University, Chongqing, China (Ringgold ID: RIN12525)
⁹Department of Medicine, Svendborg Sygehus, Svendborg, Denmark (Ringgold ID: RIN53172)
¹⁰Surgical Research Unit, Odense University Hospital, Odense, Denmark (Ringgold ID: RIN11286)
¹¹Department of Gastroenterology, Pomeranian Medical University in Szczecin, Szczecin, Poland (Ringgold ID: RIN37805)

Keywords

Endoscopy Small Bowel - Capsule endoscopy - Small bowel endoscopy - Artificial Intelligence

Introduction

Since the introduction of small-bowel capsule endoscopy (SBCE) about two decades ago, indications for its use in clinical practice have gradually expanded, and it has been accepted as the gold standard diagnostic tool for investigating small bowel (SB) pathology [1]. SBCE reading is a time-consuming task requiring high concentration levels, making it prone to human-related performance errors [2]. Consequently, SBCE has become fertile ground for the application of artificial intelligence (AI) algorithms [3], which can provide automated detection (and possibly characterization) of lesions and reduce reading times while maintaining high performance measures (e.g., sensitivity and specificity). In addition, AI may play a role in training settings, enabling beginners to perform comparably to (or even better than) experienced readers [4] [5].

Technological advancement in recent decades has enabled integration of various machine learning (ML) models in medical devices [3], especially after the introduction of “deep learning”, a subtype of ML characterized by deep neural networks (DNNs), whose structure consists of several neuronal layers. Specifically, convolutional neural networks (CNNs) are DNN structures widely used for medical image analysis [6]. Training in ML occurs in a supervised or unsupervised manner, employing algorithms that iteratively adjust model parameters for optimal performance. Supervised learning relies on ground truth data (training set) to train ML systems, using these data as a benchmark for accuracy. Once trained, ML systems can make informed decisions and automatically extract image features. CNNs and recurrent neural networks (RNNs) are the most advanced deep learning models applied to SBCE [7].

In 2019, Ding et al. published the first report of a proprietary deep CNN algorithm integrated into one of the commercially available SBCE systems [8]. The CNN-based auxiliary model identified abnormalities with higher sensitivity levels and significantly shorter reading times than conventional analysis by experienced gastroenterologists. To date, NaviCam (AnX Robotica, Plano, Texas, United States) and OMOM (Jinshan Science & Technology, Chongqing, China) are the only capsule endoscopy systems to incorporate proprietary AI models, respectively named ProScan and SmartScan, both capable of selecting frames of interest within a video sequence, and regions of interest within these selected frames with SB abnormalities (https://www.anxrobotics.com/products/navicam-sb-capsule-system/; https://www.jinshangroup.com/product/omom-hd-capsule-endoscopy-camera/). Because other manufacturers are working on similar solutions, many published papers on integrated CNNs exist today. Despite the large number of practical, non-branded, “home-grown” AI models for SBCE proposed by various authors [9], the literature lacks a comprehensive review of the available, marketed, proprietary AI platforms in this specific setting. Consequently, this meta-analysis aimed to provide an up-to-date review of the current performance of AI auxiliary reading platforms in SBCE compared with conventional (human-only) reading.

Methods

Data sources and search strategy

We conducted a systematic literature search in PubMed to identify all relevant studies in which the performance of proprietary AI software in detecting lesions was directly compared with standard reading by physicians. The primary outcome was evaluation of performance measures of both conventional and AI-assisted reading; the secondary outcome was assessing reduction in SBCE reading time using AI auxiliary platforms compared with conventional human reading. The last literature search was performed on July 18, 2024. A manual review of the reference list of included studies followed the electronic search. The complete search string is available in Supplementary Table 1.

Inclusion and exclusion criteria

Inclusion criteria were: 1) full-text articles; 2) articles reporting performance measures of both conventional and proprietary AI-assisted reading in lesion detection; and 3) articles in the English language. Exclusion criteria were article types such as reviews/systematic reviews, editorials/perspectives/opinion pieces, individual case reports, letters to the editor, and commentaries.

Screening of references

After excluding duplicates, three authors independently screened references (P.C.V., S.F., S.K.). Each author screened two-thirds of the references (title and abstract) according to the inclusion and exclusion criteria. In case of discrepancy, an article was included for full-text evaluation. This approach was repeated on included references with three authors' assessment of the full text (P.C.V., S.F., S.K.). In case of discrepancy in the full-text evaluation, the third author would also evaluate the reference, and a consensus discussion among all three would determine the outcome.
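One way to realize a scheme in which each of three reviewers screens two-thirds of the references is a rotating pair assignment, which gives every reference exactly two independent screeners. The sketch below is illustrative only: the rotation rule and the function name `assign_screeners` are assumptions, not the procedure the authors actually followed, and the reference count is taken from the Results purely as an example size.

```python
from collections import Counter
from itertools import combinations

REVIEWERS = ("P.C.V.", "S.F.", "S.K.")  # reviewer initials from the text


def assign_screeners(n_references, reviewers=REVIEWERS):
    """Assign each reference to two of the three reviewers by rotating pairs.

    Hypothetical allocation rule: reference i goes to pair (i mod 3).
    """
    pairs = list(combinations(reviewers, 2))  # the 3 possible reviewer pairs
    return [pairs[i % len(pairs)] for i in range(n_references)]


# 669 references were identified in the initial search (example size only).
assignment = assign_screeners(669)
workload = Counter(name for pair in assignment for name in pair)

for name, count in workload.items():
    print(f"{name}: {count} references ({count / len(assignment):.0%})")
```

With this rotation, a discrepancy between the two screeners of a reference is easy to detect, and the remaining reviewer is available as the tie-breaker described for the full-text stage.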

Data extraction

Data were extracted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [15]. We extracted data on the SBCE model, type of AI auxiliary platform, number of patients and images in both the training and validation phases, per-patient and per-lesion analysis performance measures in conventional and AI-assisted reading, number of analyzed images, and reading times. Only data regarding the SB were extracted in studies assessing segments other than SB (e.g., stomach).

Study assessment and risk of bias

The included studies underwent assessment of methodological transparency by two independent reviewers (P.C.V., S.F.) using the Methodological Index for Non-randomized Studies (MINORS) assessment tool [16]. Studies achieving at least two-thirds of the maximum achievable score (24 for comparative studies) were considered highly transparent in methodology.
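The two-thirds rule can be checked against the MINORS scores listed in Table 1. The short sketch below does that; whether a score of exactly 16 counts as passing is exposed as a parameter, and the inclusive reading used by default is an assumption, chosen because the Results describe all six studies (scores 16 to 24) as highly transparent.

```python
MINORS_MAX = 24  # maximum achievable score for comparative studies

# MINORS scores as listed in Table 1.
minors_scores = {
    "Ding 2019": 17,
    "Xie 2022": 18,
    "Ding 2023": 16,
    "O'Hara 2023": 19,
    "Spada 2024": 24,
    "Xie 2024": 18,
}

threshold = 2 * MINORS_MAX / 3  # two-thirds of the maximum score


def highly_transparent(score, inclusive=True):
    """Return True if a score meets the two-thirds rule.

    Assumption: the threshold is inclusive by default (score >= 16).
    """
    return score >= threshold if inclusive else score > threshold


for study, score in minors_scores.items():
    print(f"{study}: {score}/{MINORS_MAX} -> {highly_transparent(score)}")
```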

Statistics

Statistics reported from each included study were used to visualize test performance of conventional and AI-assisted SBCE reading in crosshair and forest plots, including sensitivity, specificity, and false-positive rates. The numbers of true positives, true negatives, false positives, and false negatives from each included study were stratified by conventional and AI-assisted SBCE reading; they were either extracted directly from the studies or deduced from the available counts of total patient videos, total positive patient videos, and per-patient sensitivity. These quantities were used as input to calculate individual and pooled diagnostic odds ratios (ORs) as a comparative measure of test performance between conventional and AI-assisted SBCE reading. A random effects model was employed because no assumption of a common true effect size across the included studies could be made. Where input cells equaled zero, a continuity correction of 0.5 was applied, which is the default in the mada package for R. To ease reading and interpretation of the diagnostic ORs, we reported and visualized the log diagnostic ORs in forest plots including the pooled estimates, stratified by conventional and AI-assisted reading. Cochran's Q and Higgins' I² were employed to investigate the degree of heterogeneity between the included studies [17] [18] [19].
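As an illustration of the quantity being pooled, the diagnostic OR of a single study is (TP × TN) / (FP × FN), and its logarithm has the standard large-sample standard error √(1/TP + 1/FP + 1/FN + 1/TN). The sketch below computes both for one 2×2 table. It is a minimal Python illustration, not the authors' R code: the function names are hypothetical, and the way the 0.5 correction is applied here (to all four cells of a study that has at least one zero cell) is an assumption that may differ from how mada handles the correction across studies.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Return (log diagnostic odds ratio, standard error) for one 2x2 table.

    Assumption: the continuity correction is added to all four cells of the
    study only when at least one of its cells is zero.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    value = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return value, se


def ci95(value, se):
    """Wald-type 95% confidence interval on the log scale."""
    return value - 1.96 * se, value + 1.96 * se


# Per-patient counts for Spada 2024 as listed in Table 3.
conventional = log_dor(tp=83, fp=0, fn=22, tn=28)
ai_assisted = log_dor(tp=98, fp=0, fn=7, tn=28)

for label, (value, se) in (("conventional", conventional), ("AI", ai_assisted)):
    low, high = ci95(value, se)
    print(f"{label}: log DOR = {value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because a zero false-positive cell dominates the standard error after correction, single-study intervals computed this way are wide; the pooled estimates reported in the Results come from the random effects model, which this sketch does not reproduce.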

Results

Overall, 669 references were identified in the initial search. Abstract screening excluded 565 records, leaving 104 references for full-text reading. Six studies (n = 6) were eventually included [8] [10] [11] [12] [13] [14], all of which reported comparisons on conventional and AI-assisted reading performance measures, enabling them to be included in pooled estimates ([Fig. 1]). MINORS scores ranged from 16 to 24, highlighting high methodological transparency in all included studies ([Table 1]). Validation procedures for each included study are depicted in [Table 2].

Fig. 1 Flow diagram of the study. AI, artificial intelligence; CAC, computed assessment of cleansing; SBCE, small-bowel capsule endoscopy.

Table 1 Characteristics of included studies.
Study (year) [ref]	Study type	Software	AI model	Type of lesion	Training data: patients (n)	Training data: images (n)	Validation data: videos (n)	Validation data: images (n)	MINORS (0–24)
Ding Z. (2019) [8]	Multicenter Retrospective	NaviCam ESView	CNN (ProScan)	Any	1970	158,235	5000	113,268,334	17
Xie X. (2022) [10]	Multicenter Retrospective	OMOM VUE smart	CNN (YOLO) (SmartScan)	Any	2927	757,770	2898	146,956,145	18
Ding Z. (2023) [11]	Multicenter Retrospective	NaviCam ESView	CNN + CRNN (ProScan)	Any	2565	280,426	240	5,741,518	16
O’Hara FJ. (2023) [12]	Unicenter Retrospective	OMOM VUE smart	CNN (YOLO) (SmartScan)	Any	NR	NR	39	NR	19
Spada C. (2024) [13]	Multicenter Prospective	NaviCam ESView	CNN (ProScan)	Bleeding lesions	NR	NR	133	NR	24
Xie X. (2024) [14]	Multicenter Retrospective	OMOM VUE smart	CNN (YOLO) (SmartScan)	Any	1069	40,508	342	NR	18
AI, artificial intelligence; CNN, convolutional neural network; MINORS, Methodological Index for Non-randomized Studies; NR, not reported; YOLO, "you only look once" algorithm.

Table 2 Validation procedures for included studies.
Study (year) [ref]	Validation procedure
Ding Z. (2019) [8]	5000 videos were distributed to 20 expert gastroenterologists (250 each) for full conventional reading. Abnormal images detected by AI-assisted reading were reviewed manually by the same readers and a diagnosis was made. In case of discrepancy (conventional vs. AI-assisted), a final diagnosis was made after consensus among all 20 gastroenterologists. Final consensus diagnosis was considered the gold standard.
Xie X. (2022) [10]	Stage 1: 2898 videos were distributed to 8 experienced gastroenterologists [> 200 cases/year] (about 362 each) for full conventional reading. Stage 2 (after 6 months): AI-assisted reading by the same gastroenterologists. Stage 3 (after 3 months): 3 expert readers [> 800 cases/year] provided adjudication on discordant cases. The combined agreed comparator was formed by concordant findings (stages 1 and 2) and discordant findings adjudicated by the group of expert readers (stage 3).
Ding Z. (2023) [11]	240 videos were distributed to expert gastroenterologists for full conventional reading. Abnormal images detected by AI-assisted reading were reviewed manually by the same readers and a diagnosis was made. In case of discrepancy (conventional vs. AI-assisted), a final diagnosis was made after consensus among all 20 gastroenterologists. Final consensus diagnosis was considered the gold standard.
O’Hara FJ. (2023) [12]	40 videos were distributed to 2 experienced gastroenterologists for full conventional reading; results were reviewed by an expert group (3 experienced [50–100 cases/year] and 2 expert [> 200 cases/year for > 15 years] gastroenterologists). AI-assisted reading was performed by the same readers 3 months later, and each image selected by the algorithm was evaluated for findings.
Spada C. (2024) [13]	Phase 1: investigators performed full conventional reading of videos at the site of patient’s enrolment (133 videos). Phase 2: after anonymization, videos were randomly reallocated to an external center for a second, AI-assisted reading. Phase 3: a board of 5 experts (> 500 cases) reviewed all videos to compare the results and to evaluate the match of findings of phases 1 and 2. In case of discrepancy, the board consensus reassessment was considered the gold standard.
Xie X. (2024) [14]	Stage 1: 342 videos were distributed to experienced gastroenterologists [> 200 cases/year for 10 years] (342 each) for full conventional reading. Stage 2 (after 5 months): AI-assisted reading by the same gastroenterologists. Stage 3: 3 senior readers [> 300 cases/year for 15 years] provided adjudication on discordant cases. The combination of concordant findings (stages 1 and 2) and discordant findings adjudicated by the senior readers (stage 3) was considered the reference standard.
AI, artificial intelligence.

Performance comparison—per-patient analysis

The number of true positives, true negatives, false positives, and false negatives, stratified by reading measure, is reported for each study in [Table 3]. Overall per-patient performance values were not reported in two studies [11] [12]. The false-positive rate was comparable between conventional and AI-assisted reading ([Fig. 2]), whereas sensitivity was markedly higher with AI-assisted reading. In the four studies included in the per-patient pooled analysis [8] [10] [13] [14], sensitivity with AI-assisted reading was 1.0, 0.99, 0.93, and 0.98, respectively, compared with 0.75, 0.88, 0.79, and 0.89 with conventional reading. Notably, no difference was found in specificity: its values and related confidence intervals (CIs) were identical for the two reading measures ([Fig. 3]). Pooled estimates of the log diagnostic OR were 7.4 (95% CI 5.7–9.2) for conventional reading and 10.3 (95% CI 7.1–13.5) for AI-assisted reading. Prediction intervals were 4.95 to 9.87 for conventional reading and 4.40 to 16.16 for AI-assisted reading. In all four studies, the log diagnostic OR estimate was higher with AI-assisted reading than with conventional reading, although with overlapping CIs ([Fig. 4]). No substantial heterogeneity was observed in the pooled analysis for either conventional reading (Cochran's Q: 3.179 [3 df, P = 0.365], Higgins' I²: 5.6%) or AI-assisted reading (Cochran's Q: 3.394 [3 df, P = 0.335], Higgins' I²: 11.6%).
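The relationship between the two heterogeneity statistics reported above can be made explicit: Higgins' I² is the share of Cochran's Q that exceeds its degrees of freedom, I² = max(0, (Q − df) / Q) × 100%. The following sketch applies this standard formula to the reported Q values; the function name `higgins_i2` is hypothetical.

```python
def higgins_i2(q, df):
    """Higgins' I² (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Cochran's Q values reported for the per-patient pooled analyses (3 df each).
i2_conventional = higgins_i2(3.179, 3)
i2_ai_assisted = higgins_i2(3.394, 3)

print(f"Conventional reading: I² = {i2_conventional:.1f}%")
print(f"AI-assisted reading:  I² = {i2_ai_assisted:.1f}%")
```

Rounded to one decimal, these reproduce the reported values of 5.6% and 11.6%.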

Table 3 Quantitative input used to estimate diagnostic odds ratios for pooled meta-analysis (per-patient analysis).
Study	True positives (conventional)	True positives (AI)	False positives (conventional)	False positives (AI)	False negatives (conventional)	False negatives (AI)	True negatives (conventional)	True negatives (AI)
Ding, 2019	2443	3272	0	0	833	4	1724	1724
Xie, 2022	2048	2298	0	0	278	28	572	572
Spada, 2024	83	98	0	0	22	7	28	28
Xie, 2024	191	212	1	0	24	3	126	126
AI, artificial intelligence.
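The per-patient sensitivity, specificity, and false-positive rate follow directly from the counts in Table 3: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and false-positive rate = 1 − specificity. The sketch below recomputes them; the dictionary layout and the function name `rates` are illustrative choices, and small differences from the values quoted in the text can arise from rounding.

```python
# Counts from Table 3, ordered as (TP, FP, FN, TN) for each reading measure.
table3 = {
    "Ding 2019": {"conventional": (2443, 0, 833, 1724), "AI": (3272, 0, 4, 1724)},
    "Xie 2022": {"conventional": (2048, 0, 278, 572), "AI": (2298, 0, 28, 572)},
    "Spada 2024": {"conventional": (83, 0, 22, 28), "AI": (98, 0, 7, 28)},
    "Xie 2024": {"conventional": (191, 1, 24, 126), "AI": (212, 0, 3, 126)},
}


def rates(tp, fp, fn, tn):
    """Return (sensitivity, specificity, false-positive rate) for a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, 1 - specificity


for study, readings in table3.items():
    for reading, counts in readings.items():
        sens, spec, fpr = rates(*counts)
        print(f"{study:<11} {reading:<13} sens={sens:.3f} spec={spec:.3f} fpr={fpr:.3f}")
```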

Fig. 2 Crosshair plots visualizing, for each reading measure, the sensitivity and false-positive rate estimates of each study included in the pooled per-patient analysis. The width of the whiskers indicates the study sample size (weight).

Fig. 3 Forest plots of sensitivity and specificity for each study included in the pooled per-patient analysis, stratified by reading measure.

Fig. 4 Pooled estimates of log diagnostic odds ratios for each reading measure (per-patient analysis).

Performance comparison—per-lesion analysis

Five studies reported per-lesion performance indicators. Because the definition of pathological findings differed significantly among the included studies, we only considered overall values. Regarding accuracy and sensitivity, AI-assisted reading achieved statistically significantly higher results than conventional reading, as demonstrated by the P values ([Table 4]). Regarding specificity, AI-assisted reading obtained 100% and 97.1%, respectively [8] [11], compared with 100% for conventional reading (expert consensus was the comparator).

Table 4 Overall performance values (per-lesion analysis).
Study	Performance value	Conventional	AI-assisted	P value
Ding, 2019	Accuracy	54.57%	70.91%	NR
Ding, 2019	Sensitivity	76.89%	99.90%	< 0.0001
Ding, 2019	Specificity	100%	100%	> 0.99
Xie, 2022	Accuracy	76.10%	95.90%	< 0.001
Xie, 2022	Sensitivity	NR	NR	–
Xie, 2022	Specificity	NR	NR	–
Ding, 2023	Accuracy	96.6%	97.9%	NR
Ding, 2023	Sensitivity	91.1%	99.2%	< 0.0125
Ding, 2023	Specificity	100%	97.1%	< 0.0125
O'Hara, 2023	Accuracy	NR	NR	–
O'Hara, 2023	Sensitivity	86.2%	98.1%	< 0.001
O'Hara, 2023	Specificity	NR	NR	–
Xie, 2024	Accuracy	84.79%	97.24%	< 0.001
Xie, 2024	Sensitivity	NR	NR	–
Xie, 2024	Specificity	NR	NR	–
AI, artificial intelligence; NR, not reported.

Image and reading time reduction

AI software provided a net reduction in the number of images to review. In the four studies that provided these data, the mean number of images decreased 39-fold [8], 36-fold [10], 24-fold [13], and 51-fold [14]. This reduction translated into shorter reading times: AI-assisted reading required a mean of 4.7 minutes, whereas conventional reading required 56.7 minutes ([Fig. 5]).
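The size of the time saving can be expressed as a fold change, an absolute difference, or a relative reduction. The few lines below work these out from the two reported mean reading times; the variable names are illustrative.

```python
# Mean reading times reported above, in minutes.
conventional_min = 56.7
ai_assisted_min = 4.7

fold_reduction = conventional_min / ai_assisted_min         # how many times shorter
minutes_saved = conventional_min - ai_assisted_min          # absolute saving per exam
relative_reduction = 1 - ai_assisted_min / conventional_min  # share of time saved

print(f"Fold reduction:     {fold_reduction:.1f}x")
print(f"Minutes saved:      {minutes_saved:.1f}")
print(f"Relative reduction: {relative_reduction:.1%}")
```

The fold reduction comes out at roughly 12, which corresponds to a saving of 52 minutes per exam, or about 92% of the conventional reading time.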

Fig. 5 Reading times. Red color: conventional reading; green color: AI-assisted reading.

Discussion

One of the primary concerns for gastroenterologists when analyzing images from lengthy SBCE videos is the risk of missing lesions, because these may appear in only a few of the tens of thousands of frames that make up the entire video sequence. The problem is compounded by the time required to review a recording: reading time may decrease with experience, but faster reading carries a recognized potential for missing lesions [20]. A seminal paper by Beg et al. showed that reader accuracy declines after just one SBCE reading [2]. Therefore, the time required to report each exam is crucial to ensure reading quality, especially when multiple exams are reviewed consecutively. In addition, previous works have underscored disappointing results in terms of interobserver/intraobserver agreement and detection rate of significant findings, regardless of SBCE reader experience [21]. A paper by Rondonotti et al. showed that dedicated training programs did not significantly increase the performance of readers with different levels of experience [22].

The results of our study highlight that AI-assisted reading provides superior diagnostic performance in terms of accuracy and sensitivity in both per-lesion and per-patient analyses across all included studies, with a higher pooled log diagnostic OR for AI-assisted reading than for conventional reading (10.3 versus 7.4) despite overlapping CIs. This finding reassures clinicians about a reduced risk of missing pathology. It confirms that AI software fulfills its intended purpose: allowing the reader to focus solely on the most ambiguous lesions by filtering out the “noise” of thousands of negative images. In this context, our results support the role of AI-assisted reading in ushering in a new era of reduced reviewing times. The mean SBCE reading time achieved with AI assistance was 12 times shorter than that required for conventional reporting (4.7 versus 56.7 minutes, respectively). This opens new perspectives, not only in terms of performance quality but also in terms of healthcare costs. Future studies should explore this aspect further.

Interestingly, AI auxiliary platforms may also play a role in closing the gap between novice and expert readers in SBCE reading. According to the results of Ding et al. and Xie et al. [11] [14], the sensitivity for SB lesions of AI-assisted junior readers even surpassed that of experts in conventional reading mode (99.2% and 96.7% versus 91.1% and 88.8%, respectively). In the study by Ding et al., the same novice readers achieved an absolute reduction of 33.3 percentage points in missed diagnosis rate with AI assistance compared with conventional reading (34.1% conventional, 0.8% AI-assisted) [11]. These data underscore the clinical importance of AI auxiliary reading in training by improving efficiency and work performance.

Our study has several limitations, primarily related to characteristics of the included studies. First, only six studies were included, five of which were retrospective. Second, in the per-patient analysis, aggregated data included only four of six studies, potentially reducing the strength of the pooled analysis. Third, data pooling was impossible in the per-lesion analysis due to heterogeneity in the definition of pathological findings among the studies; therefore, only overall lesion values were compared with their respective conventional reading counterparts. Fourth, an intrinsic clinical limitation concerns the comparator. Because all studies considered expert consensus/board the ground truth in cases of uncertainty, one must consider inherent agreement variations among observers (even if experts). On the other hand, deep enteroscopy is the only third-party method capable of addressing this limitation, and it cannot be performed on all patients for obvious ethical and technical reasons.

However, several strengths of this work should be highlighted. To our knowledge, this is the first systematic review with a pooled analysis addressing the role of proprietary AI models in diagnostic workup of SBCE published in the literature. The decision to focus exclusively on proprietary software may be subject to criticism and praise. However, given our aim to provide a practical perspective for clinicians, concentrating on available AI systems—readily reproducible and globally applicable—represents a key strength of our study. Furthermore, our findings are bolstered by high reliability due to the low heterogeneity level observed in the pooled analyses.

Conclusions

In conclusion, compared with conventional video review, AI-assisted reading shows superior diagnostic performance, increasing detection accuracy and sensitivity while markedly reducing reading times. Nevertheless, caution is advised: auxiliary AI tools cannot yet fully replace expert human reading, especially because of ethical considerations. At this stage, our findings advocate for widespread adoption of AI auxiliary software to assist clinicians in SBCE reading, easing the workload of endoscopy services and supporting novice readers along the training curve.

References
1 Pennazio M, Rondonotti E, Despott EJ. et al. Small-bowel capsule endoscopy and device-assisted enteroscopy for diagnosis and treatment of small-bowel disorders: European Society of Gastrointestinal Endoscopy (ESGE) Guideline - Update 2022. Endoscopy 2023; 55: 58-95
2 Beg S, Card T, Sidhu R. et al. The impact of reader fatigue on the accuracy of capsule endoscopy interpretation. Dig Liver Dis 2021; 53: 1028-1033
3 Messmann H, Bisschops R, Antonelli G. et al. Expected value of artificial intelligence in gastrointestinal endoscopy: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement. Endoscopy 2022; 54: 1211-1231
4 Beg S, Wronska E, Araujo I. et al. Use of rapid reading software to reduce capsule endoscopy reading times while maintaining accuracy. Gastrointest Endosc 2020; 91: 1322-1327
5 Sidhu R, Chetcuti Zammit S, Baltes P. et al. Curriculum for small-bowel capsule endoscopy and device-assisted enteroscopy training in Europe: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement. Endoscopy 2020; 52: 669-686
6 Dray X, Iakovidis D, Houdeville C. et al. Artificial intelligence in small bowel capsule endoscopy - current status, challenges and future promise. J Gastroenterol Hepatol 2021; 36: 12-19
7 Qin K, Li J, Fang Y. et al. Convolution neural network for the diagnosis of wireless capsule endoscopy: a systematic review and meta-analysis. Surg Endosc 2022; 36: 16-31
8 Ding Z, Shi H, Zhang H. et al. Gastroenterologist-level identification of small-bowel diseases and normal variants by capsule endoscopy using a deep-learning model. Gastroenterology 2019; 157: 1044-1054.e5
9 Dhali A, Biswas J, Kipkorir V. et al. Tu2027 Artificial intelligence-assisted versus conventional capsule endoscopy for detection of small bowel lesion-systematic review and meta-analysis. Gastroenterology 2024; 166: S-1498
10 Xie X, Xiao Y-F, Zhao X-Y. et al. Development and validation of an artificial intelligence model for small bowel capsule endoscopy video review. JAMA Netw Open 2022; 5: e2221992
11 Ding Z, Shi H, Zhang H. et al. Artificial intelligence-based diagnosis of abnormalities in small-bowel capsule endoscopy. Endoscopy 2023; 55: 44-51
12 O’Hara FJ, Mc Namara D. Capsule endoscopy with artificial intelligence-assisted technology: Real-world usage of a validated AI model for capsule image review. Endosc Int Open 2023; 11: E970-E975
13 Spada C, Piccirelli S, Hassan C. et al. AI-assisted capsule endoscopy reading in suspected small bowel bleeding: a multicentre prospective study. Lancet Digit Health 2024; 6: e345-e353
14 Xie X, Xiao Y-F, Yang H. et al. A new artificial intelligence system for both stomach and small-bowel capsule endoscopy. Gastrointest Endosc 2024; 100: 878.e1-878.e14
15 Liberati A, Altman DG, Tetzlaff J. et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009; 339: b2700
16 Slim K, Nini E, Forestier D. et al. Methodological index for non-randomized studies (minors): development and validation of a new instrument. ANZ J Surg 2003; 73: 712-716
17 Doebler P, Sousa-Pinto B. mada: Meta-analysis of diagnostic accuracy. https://doi.org/10.32614/CRAN.package.mada
18 Glas AS, Lijmer JG, Prins MH. et al. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol 2003; 56: 1129-1135
19 Phillips B, Stewart LA, Sutton AJ. “Cross hairs” plots for diagnostic meta-analysis. Res Synth Methods 2010; 1: 308-315
20 Rondonotti E, Pennazio M, Toth E. et al. How to read small bowel capsule endoscopy: a practical guide for everyday use. Endosc Int Open 2020; 8: E1220-E1224
21 Cortegoso Valdivia P, Deding U, Bjørsum-Meyer T. et al. Inter/intra-observer agreement in video-capsule endoscopy: Are we getting it all wrong? A systematic review and meta-analysis. Diagnostics (Basel) 2022; 12: 2400
22 Rondonotti E, Soncini M, Girelli CM. et al. Can we improve the detection rate and interobserver agreement in capsule endoscopy? Dig Liver Dis 2012; 44: 1006-1011
