Endoscopy 2026; 58(02): 121-129
DOI: 10.1055/a-2661-2624
Original article

Long-term impact of computer-aided adenoma detection: a prospective observational study

Authors

  • Taishi Okumura

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Shin-ei Kudo

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yutaro Ide

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Shun Kato

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yuki Miyata

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Kazumi Takisima

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yuki Takashina

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yosuke Minegishi

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Masahiro Abe

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Tatsuya Sakurai

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yuta Koyama

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Kazuki Kato

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yasuharu Maeda

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Yushi Ogawa

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Katsuro Ichimasa

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Noriyuki Ogata

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Takemasa Hayashi

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Kunihiko Wakamura

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Toshiyuki Baba

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • Tetsuo Nemoto

    2   Department of Diagnostic Pathology and Laboratory Medicine, Showa Medical University, Northern Yokohama Hospital, Kanagawa, Japan (Ringgold ID: RIN220878)
  • Hideyuki Miyachi

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
    3   Department of Gastroenterology, Kochi University, Kochi, Japan (Ringgold ID: RIN12888)
  • Hayato Itoh

    4   Graduate School of Informatics, Nagoya University, Nagoya, Japan
    5   Department of Applied Mathematics, Faculty of Science, Fukuoka University, Fukuoka, Japan (Ringgold ID: RIN12774)
  • Masahiro Oda

    4   Graduate School of Informatics, Nagoya University, Nagoya, Japan
    6   Information Technology Center, Nagoya University, Nagoya, Japan
  • Kensaku Mori

    4   Graduate School of Informatics, Nagoya University, Nagoya, Japan
    6   Information Technology Center, Nagoya University, Nagoya, Japan
  • Masashi Misawa

    1   Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa, Japan
  • on behalf of the CADe colonoscopy working group

Clinical Trial:

Registration number (trial ID): UMIN000040677, Trial registry: UMIN Japan (http://www.umin.ac.jp/english/), Type of Study: Prospective Observational Study


 



Abstract

Background

Computer-aided detection (CADe) systems have improved the adenoma detection rate (ADR); however, concerns about their long-term effect on endoscopists’ performance without CADe and potential deskilling remain unaddressed. This study evaluated the impact of CADe on the learning curve of endoscopists for adenoma detection.

Methods

This propensity score-matched, prospective, single-center, observational study was conducted from January 2021 to December 2023. CADe systems were installed in half of the endoscopy units, and patients were equally distributed between the rooms. Patients aged ≥20 years scheduled for colonoscopy were included, excluding those with polyposis, inflammatory bowel disease, known polyps, incomplete colonoscopy, emergency cases, or previous colorectal surgery, and those examined by novices. Endoscopists were classified as high detectors (ADR ≥25%) or low detectors (ADR <25%) based on the ADR recorded before CADe implementation. To assess skill acquisition and transfer, the primary outcome was the change in ADR over time, as measured by cumulative summation (CUSUM) analysis, in both CADe and non-CADe procedures.

Results

Of 18 962 patients who underwent colonoscopy, 13 245 patients were excluded, and of the 5717 patients initially enrolled, 4712 (CADe group, n = 2356; non-CADe group, n = 2356) were analyzed after propensity score matching. CUSUM analysis showed that both high and low detectors achieved enhanced detection performance for CADe procedures. Among non-CADe procedures, high detectors had accelerated learning curves, indicating they maintained a higher ADR, whereas low detectors showed no significant change in their learning trajectory.

Conclusions

After CADe implementation, the detection rate in procedures performed without CADe was maintained and did not decline over time.



Introduction

Colorectal cancer (CRC) is a major cause of cancer-related morbidity and mortality worldwide. The adenoma detection rate (ADR) is considered an important quality indicator of screening colonoscopy because it is inversely associated with CRC mortality [1] [2]. Computer-aided detection (CADe) systems have emerged as significant contributors to an improved ADR in this critical procedure [3] [4]. A recent systematic review and meta-analysis of randomized controlled trials (RCTs) reported that CADe improved the ADR compared with standard colonoscopy and was effective at decreasing the adenoma miss rate [5] [6] [7]. These findings collectively underscore the potential of CADe systems to significantly improve the quality and effectiveness of colonoscopic examinations in real-world clinical practice.

Despite the demonstrated benefits of CADe in improving ADR, a significant knowledge gap exists regarding its long-term impact on the performance of endoscopists, especially those who transition from non-CADe-assisted to CADe-assisted colonoscopy, beyond the initial period of use. Furthermore, it remains unknown how the ADR of endoscopists performing traditional procedures without CADe changes over an extended timeframe once they begin using CADe systems. Understanding whether CADe enhances or diminishes the native detection abilities of endoscopists is critical for optimizing its implementation in clinical practice. To address this knowledge gap, we conducted a study to evaluate the long-term impact of CADe on the learning curve of endoscopists for adenoma detection, with particular attention given to whether the system led to skill enhancement or potential deskilling.


Methods

Study design and patients

This prospective nonblinded single-center study was performed at Showa University Northern Yokohama Hospital between January 2021 and December 2023. The inclusion criterion was patients aged ≥20 years who were scheduled for colonoscopy. Exclusion criteria were: polyposis, inflammatory bowel disease, or known polyps; incomplete total colonoscopy owing to stricture or obstructive cancer; emergency colonoscopy; previous colorectal surgery; examination by a novice (defined as an endoscopist with less than 1 year of colonoscopy experience); and refusal to participate in the study.

This study was approved by the ethics review board of Showa University Northern Yokohama Hospital on February 18, 2020 (approval no. 19H072) and was conducted in compliance with the Declaration of Helsinki. We obtained consent from all participants. All authors had access to the study data, and reviewed and approved the final manuscript.


Instruments used in this study

We used a high definition video endoscope system (EVIS X1 systems; Olympus Corporation, Tokyo, Japan) and a high vision endoscope (PCF-H290Z, CF-H290ECI, CF-XZ1200, or CF-EZ1500; Olympus Corporation) in this study. We used EndoBRAIN-EYE (Cybernet Systems Corp., Tokyo, Japan) as the CADe software. This software has regulatory approval in Japan, has been used as a medical device since 2020, and has since been launched in multiple Asian countries. The CADe system uses a deep learning-based algorithm (YOLOv3). If the system detects a colorectal polyp, it draws a bounding box around the inferred polyp area, emits an alert sound, and places colors at the four corners of the endoscopic image (Fig. 1s, see online-only Supplementary material). The algorithm and training samples have been previously reported [8] [9]. An update to the CADe system in April 2022 significantly reduced the false-positive detection rate [10].


Colonoscopy procedure

All endoscopists, except one (M.M.), who developed the CADe system, received a standardized, approximately 30-minute instructional session prior to study initiation. The session included: (a) an explanation of the CADe system’s functions and interface; (b) demonstration videos; and (c) a question and answer session.

We have been using the CADe in our daily clinical practice since 2020. Of the eight colonoscopy rooms in our institution, we set up the CADe system in four of them. Patients were distributed equally to each room; however, if the room was not available because of a delay in the previous colonoscopy, the next patients were allocated at the discretion of the medical staff overseeing the availability of the other rooms. Patients who were allocated to CADe rooms were examined by endoscopists with the support of CADe (CADe group). Patients who were allocated to rooms without CADe were examined conventionally (non-CADe group). Endoscopists were assigned to either CADe or non-CADe rooms by their endoscopy team leader on a daily basis, aiming for balanced exposure to both conditions. Although the allocation was intended to be as unbiased as possible, practical considerations such as room availability and clinical workflow sometimes required adjustments.

Before the colonoscopy began, patients were given 2–3 L of polyethylene glycol solution on the morning of the examination as bowel preparation. If an endoscopist detected a polyp, they diagnosed the lesion by evaluating it with narrow-band imaging and/or chromoendoscopic images. Endoscopists were encouraged to resect all endoscopically diagnosed neoplastic lesions.


Data collection

We investigated patient characteristics (age and sex), colonoscopy information (indications, bowel preparation, cecal intubation, withdrawal time, and endoscopist), and lesion information (size, morphology, location, and pathological findings). All data were retrieved from the electronic medical record database. The quality of the bowel preparation was recorded using the Aronchick bowel preparation scale [11], which contains four categories (i.e. excellent, good, fair, and poor).

Among the 53 participating endoscopists, 36 were experts with ≥1000 colonoscopies and 17 were nonexperts with <1000 colonoscopies before the implementation of CADe. All endoscopists were classified as high detectors (ADR ≥25%; n = 20) or low detectors (ADR <25%; n = 33) based on their ADR recorded before CADe implementation. This threshold of 25% was selected as it represents the minimum ADR recommended for quality colonoscopy [12] [13].

Lesion morphology was recorded according to the Paris classification [14]. The locations were classified as right-sided colon (at or proximal to the splenic flexure), left-sided colon (from the splenic flexure to the sigmoid colon), or rectal (from the rectosigmoid to the rectum).


Pathological diagnosis

The resected specimens were fixed in 10% buffered formalin and stained with hematoxylin and eosin. The stained specimens were examined by at least two pathologists certified by the Japanese Society of Pathology. The pathological diagnoses were based on the Japanese Classification of Colorectal, Appendiceal, and Anal Carcinomas [15] and the World Health Organization criteria [16].


Outcome measures

To assess skill acquisition and transfer, we measured changes in the ADR over time using cumulative summation (CUSUM) analysis for both CADe and non-CADe procedures. The aim was to assess the effect of CADe implementation on the learning curve for adenoma detection, including separate analyses for high detectors and low detectors to evaluate skill acquisition and transfer. The secondary outcomes were the ADR, number of detected adenomas per colonoscopy, advanced neoplasia detection rate, sessile serrated lesion detection rate (SSLDR), polyp detection rate (PDR), ADR by morphology, and the ADR and SSLDR trends. We also analyzed the number of polypectomies for non-neoplastic lesions per colonoscopy.

Detection rates were calculated as the proportion of patients with pathologically confirmed lesions among the total number of colonoscopy cases. Advanced neoplasia was defined as the presence of a high grade adenoma (intramucosal carcinoma), adenomatous lesion with a villous component, adenomatous lesion >10 mm, or invasive cancer. Non-neoplastic lesions were defined as lesions that showed neither adenomatous nor SSL pathology.
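The per-patient definitions above can be illustrated with a minimal Python sketch. The `Colonoscopy` record and its field names are hypothetical and are not taken from the study database; the example data are invented purely to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Colonoscopy:
    """Hypothetical per-procedure record (field names are illustrative)."""
    patient_id: str
    adenomas: int  # number of pathologically confirmed adenomas


def detection_rate(cases: Iterable[Colonoscopy],
                   has_lesion: Callable[[Colonoscopy], bool]) -> float:
    """Proportion of colonoscopies with at least one confirmed lesion."""
    cases = list(cases)
    if not cases:
        raise ValueError("at least one colonoscopy is required")
    positives = sum(1 for case in cases if has_lesion(case))
    return positives / len(cases)


def adenomas_per_colonoscopy(cases: Iterable[Colonoscopy]) -> float:
    """Mean number of confirmed adenomas per colonoscopy."""
    cases = list(cases)
    if not cases:
        raise ValueError("at least one colonoscopy is required")
    return sum(case.adenomas for case in cases) / len(cases)


# Invented example: four procedures, two of which found adenomas.
example = [
    Colonoscopy("a", 0),
    Colonoscopy("b", 2),
    Colonoscopy("c", 1),
    Colonoscopy("d", 0),
]
adr = detection_rate(example, lambda case: case.adenomas > 0)
apc = adenomas_per_colonoscopy(example)
print(f"ADR = {adr:.1%}, adenomas per colonoscopy = {apc:.2f}")
```

The same `detection_rate` helper could be reused for the SSLDR or PDR by passing a different predicate.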

In this prospective observational study, the primary and secondary outcomes were exploratory and intended to generate hypotheses rather than confirm findings.


Statistical analysis

To reduce potential bias, we conducted a propensity score-matching analysis using known factors (age, sex, indication, and bowel preparation) that may have affected the ADR. Propensity scores were estimated using logistic regression, and matching used a caliper width of 0.2 of the SD of the logit of the propensity score [17]. Categorical variables were expressed as frequencies (%) and compared using Fisher’s exact test. Continuous variables were expressed as medians with interquartile ranges (25th–75th percentiles) and analyzed using the Mann–Whitney U test.
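As an illustration of caliper matching, the following Python sketch pairs patients on the logit of a precomputed propensity score. It is a simplified example under stated assumptions, not the procedure run in EZR: greedy 1:1 nearest-neighbor matching without replacement and the pooled-SD definition of the caliper are assumptions, the function name `caliper_match` is hypothetical, and the scores are invented.

```python
import math
import statistics


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated: dict, control: dict, caliper_sd: float = 0.2) -> list:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `control` map patient IDs to propensity scores.
    A pair is accepted only if the logits differ by at most
    `caliper_sd` times the pooled SD of the logit (an assumed definition).
    """
    lt = {pid: logit(score) for pid, score in treated.items()}
    lc = {pid: logit(score) for pid, score in control.items()}
    pooled_sd = math.sqrt(
        (statistics.variance(lt.values()) + statistics.variance(lc.values())) / 2
    )
    caliper = caliper_sd * pooled_sd

    available = dict(lc)
    pairs = []
    # Process treated patients in ascending order of logit.
    for tid, tval in sorted(lt.items(), key=lambda item: item[1]):
        if not available:
            break
        cid = min(available, key=lambda c: abs(available[c] - tval))
        if abs(available[cid] - tval) <= caliper:
            pairs.append((tid, cid))
            del available[cid]  # each control is used at most once
    return pairs


# Invented propensity scores for illustration only.
cade = {"t1": 0.30, "t2": 0.60, "t3": 0.95}
non_cade = {"c1": 0.31, "c2": 0.58, "c3": 0.10, "c4": 0.70}
print(caliper_match(cade, non_cade))
```

In this toy example, `t3` stays unmatched because no remaining control lies within the caliper, which mirrors how matching reduced the 5717 eligible patients to 4712 analyzed patients.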

The learning curve for adenoma detection was assessed using CUSUM analysis, a sequential technique that is widely used to monitor performance and detect changes in clinical outcomes and procedural efficiency in medical fields such as surgery and anesthesiology [18] [19] [20]. CUSUM analysis has also been applied to assess learning curves in endoscopic diagnosis and detection [21] [22]. This method is independent of predetermined sample sizes and enables continuous real-time evaluation of performance data, so formal sample size calculations are not required [20]. Detailed explanations of the CUSUM methodology are presented in Appendix 1s.
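For readers unfamiliar with the technique, the sketch below shows one commonly used CUSUM formulation for binary outcomes, in which each procedure without a detected adenoma counts as a "failure". The study's exact specification is given in Appendix 1s and is not reproduced here; the acceptable and unacceptable failure rates (`p0`, `p1`) and the error rates (`alpha`, `beta`) used below are hypothetical values chosen only for illustration.

```python
import math


def cusum_parameters(p0: float, p1: float, alpha: float, beta: float):
    """Return (s, h0, h1) for a binary-outcome CUSUM chart.

    p0: acceptable failure rate; p1: unacceptable failure rate (p1 > p0).
    alpha, beta: type I and type II error rates.
    """
    P = math.log(p1 / p0)
    Q = math.log((1.0 - p0) / (1.0 - p1))
    s = Q / (P + Q)                       # decrement applied per success
    a = math.log((1.0 - beta) / alpha)
    b = math.log((1.0 - alpha) / beta)
    h0 = -b / (P + Q)                     # acceptable decision limit
    h1 = a / (P + Q)                      # unacceptable decision limit
    return s, h0, h1


def cusum_curve(detected, s: float) -> list:
    """Cumulative score: subtract s per success, add (1 - s) per failure."""
    total = 0.0
    curve = []
    for hit in detected:
        total += -s if hit else (1.0 - s)
        curve.append(total)
    return curve


# Hypothetical settings: failure = colonoscopy without an adenoma.
s, h0, h1 = cusum_parameters(p0=0.70, p1=0.80, alpha=0.10, beta=0.10)
outcomes = [True, False, True, True, False, True]  # invented sequence
curve = cusum_curve(outcomes, s)
print(f"s = {s:.3f}, h0 = {h0:.2f}, h1 = {h1:.2f}")
print([round(value, 2) for value in curve])
```

Under this formulation a curve that drifts downward and crosses h0 indicates acceptable performance, whereas a curve rising above h1 indicates unacceptable performance, consistent with how the CUSUM plots in this study are read.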

P < 0.05 was considered statistically significant. All statistical analyses were performed using EZR (version 1.56; Saitama Medical Center, Jichi Medical University, Saitama, Japan), a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria).



Results

Patient and lesion characteristics

During the study period, 18962 patients underwent either colonoscopy with the CADe system or routine colonoscopy, and 13245 patients were excluded in accordance with the exclusion criteria. Overall, 5717 patients were eligible for the study. After propensity score matching, 4712 patients (2356 in each group) were included in the final analysis ([Fig. 1]).

Fig. 1 Study flowchart of patient inclusion and propensity score matching.

In the matched cohort, baseline patient characteristics were well balanced between the groups, confirming the adequacy of propensity score matching ([Table 1]). The median age was 67 years in both groups, and male patients comprised 57.9% of each group. Regarding indications, screening (21.4%), FIT-positive (13.3%), surveillance (44.4%), and symptomatic cases (20.9%) were distributed equally between the groups. Bowel preparation was adequate in 98.6% of patients in each group. Cecal intubation was achieved in 99.9% of patients in the CADe group and 99.8% of patients in the non-CADe group. The proportion of procedures performed by high and low detectors was similar between the groups: high detectors accounted for 46.6% (n = 1098) in the CADe group and 45.6% (n = 1075) in the non-CADe group, while low detectors accounted for 53.4% (n = 1258) and 54.4% (n = 1281), respectively.

Table 1 Characteristics of the patients in the computer-aided detection (CADe) and non-CADe groups in the full and matched cohorts.

| Characteristic | Full cohort: CADe (n = 3142) | Full cohort: Non-CADe (n = 2575) | Full cohort: P value¹ | Matched cohort: CADe (n = 2356) | Matched cohort: Non-CADe (n = 2356) | Matched cohort: Standardized difference | Matched cohort: P value¹ |
|---|---|---|---|---|---|---|---|
| Age, median (25th–75th percentiles), years | 66 (56–74) | 68 (57–76) | <0.001² | 67 (57–75) | 67 (57–75) | 0.002 | 0.93² |
| Sex, male, n (%) | 1875 (59.7) | 1469 (57.0) | 0.046 | 1363 (57.9) | 1363 (57.9) | 0.009 | >0.99 |
| Indication, n (%) | | | NA | | | <0.001 | >0.99 |
|   • Screening | 800 (25.5) | 521 (20.2) | | 505 (21.4) | 505 (21.4) | | |
|   • FIT positive | 463 (14.7) | 330 (12.8) | | 314 (13.3) | 314 (13.3) | | |
|   • Surveillance | 1358 (43.2) | 1083 (42.1) | | 1045 (44.4) | 1045 (44.4) | | |
|   • Symptomatic | 521 (16.6) | 641 (24.9) | | 492 (20.9) | 492 (20.9) | | |
| Bowel preparation, adequate, n (%) | 3093 (98.4) | 2471 (96.0) | <0.001 | 2322 (98.6) | 2322 (98.6) | <0.001 | >0.99 |
| Cecum reached, n (%) | 3140 (99.9) | 2560 (99.4) | <0.001 | 2354 (99.9) | 2352 (99.8) | 0.02 | 0.69 |
| Colonoscopies performed by high detectors, n (%) | 1447 (46.1) | 1151 (44.7) | 0.31 | 1098 (46.6) | 1075 (45.6) | 0.02 | 0.52 |

FIT, fecal immunochemical test.

¹ Using Fisher’s exact test, unless otherwise stated.

² Using Mann–Whitney U test.


CUSUM analysis

CUSUM analysis revealed distinct learning patterns among the high and low detectors ([Fig. 2] and [Fig. 3]). For the CADe-assisted procedures, high and low detectors showed patterns below the acceptable performance line (h0), indicating improved adenoma detection performance with artificial intelligence (AI) support. Among high detectors, their CUSUM plots also consistently fell below the acceptable performance line (h0) over time for non-CADe colonoscopies, indicating that they had successfully acquired and maintained improved adenoma detection skills even without AI assistance. This trend was particularly evident after the first 100 procedures, suggesting a relatively rapid period of skill acquisition. In contrast, the CUSUM plots of low detectors remained between the acceptable (h0) and unacceptable (h1) limits throughout the study period, showing no significant change in learning trajectory for the non-CADe procedures.

Fig. 2 Cumulative summation (CUSUM) analysis for high detectors based on the adenoma detection rate during computer-aided detection (CADe) and non-CADe procedures (with and without AI, respectively). Performance was judged as follows: if the CUSUM plot fell below the acceptable line (h0), the performance was acceptable; if the CUSUM plot was above the unacceptable line (h1), the performance was unacceptable.
Fig. 3 Cumulative summation (CUSUM) analysis for low detectors based on the adenoma detection rate during computer-aided detection (CADe) and non-CADe procedures (with and without AI, respectively). Performance was judged as follows: if the CUSUM plot fell below the acceptable line (h0), the performance was acceptable; if the CUSUM plot was above the unacceptable line (h1), the performance was unacceptable.

Per-patient analysis

The per-patient analysis is shown in [Table 2]. The CADe group had a significantly higher ADR compared with the non-CADe group (40.5% vs. 32.4%; P < 0.001). Analysis by detector type revealed that both high detectors (42.9% vs. 37.8%; P = 0.02) and low detectors (38.5% vs. 27.9%; P < 0.001) achieved higher ADRs with CADe assistance ([Table 3]). The number of detected adenomas per colonoscopy was significantly higher in the CADe group compared with the non-CADe group (0.80 [SD 1.38] vs. 0.59 [SD 1.23]; P < 0.001).
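The relative risks reported for the per-patient comparison can be checked with a short calculation. The sketch below uses a Wald-type confidence interval on the log scale, which is an assumption because the interval method is not stated in the text; the function name `relative_risk` is hypothetical. Applied to the ADR counts (955 of 2356 vs. 764 of 2356), it gives 1.25 (1.16–1.35).

```python
import math
from statistics import NormalDist


def relative_risk(events_a: int, n_a: int, events_b: int, n_b: int,
                  level: float = 0.95):
    """Relative risk of group A vs. group B with a log-scale Wald interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of ln(RR) for two independent proportions.
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# ADR counts from the matched cohort: CADe 955/2356, non-CADe 764/2356.
rr, lower, upper = relative_risk(955, 2356, 764, 2356)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```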

Table 2 Comparison of the computer-aided detection (CADe) and non-CADe groups (per-patient analysis).

| Outcome | CADe (n = 2356) | Non-CADe (n = 2356) | Relative risk (95%CI) | P value¹ |
|---|---|---|---|---|
| Adenoma detection rate, n (%) | 955 (40.5) | 764 (32.4) | 1.25 (1.16–1.35) | <0.001 |
| Number of adenomas per colonoscopy, mean (SD) | 0.80 (1.38) | 0.59 (1.23) | NA | <0.001² |
| Advanced neoplasm detection rate, n (%) | 97 (4.1) | 110 (4.7) | 1.13 (0.87–1.15) | 0.39 |
| Sessile serrated lesion detection rate, n (%) | 162 (6.9) | 106 (4.5) | 1.53 (1.2–1.94) | 0.001 |
| Polyp detection rate, n (%) | 1128 (47.9) | 863 (36.6) | 1.31 (1.22–1.4) | <0.001 |
| Detection rate by morphologic type, n (%) | | | | |
|   • Protruded | 483 (20.5) | 396 (16.8) | 1.22 (1.08–1.38) | 0.001 |
|   • Flat | 840 (35.7) | 631 (26.8) | 1.33 (1.22–1.45) | <0.001 |
|   • Depressed | 7 (0.3) | 5 (0.2) | 1.4 (0.45–4.41) | 0.77 |
| Number of polypectomies for non-neoplastic lesions per colonoscopy, mean (SD) | 0.13 (0.43) | 0.08 (0.34) | NA | <0.001² |

¹ Using Fisher’s exact test unless otherwise stated.

² Mann–Whitney U test.

Table 3 Comparison of the computer-aided detection (CADe) and non-CADe rates among high and low detectors (per-patient analysis).

| Outcome, n (%) | CADe (n = 2356) | Non-CADe (n = 2356) | Relative risk (95%CI) | P value¹ |
|---|---|---|---|---|
| High detectors | (n = 1098) | (n = 1075) | | |
| Adenoma detection rate | 471 (42.9) | 406 (37.8) | 1.14 (1.03–1.26) | 0.02 |
| Advanced neoplasm detection rate | 42 (3.8) | 59 (5.5) | 1.44 (0.98–2.11) | 0.07 |
| Sessile serrated lesion detection rate | 86 (7.8) | 48 (4.5) | 1.75 (1.25–2.47) | 0.001 |
| Polyp detection rate | 518 (47.2) | 425 (39.5) | 1.19 (1.08–1.32) | <0.001 |
| Low detectors | (n = 1258) | (n = 1281) | | |
| Adenoma detection rate | 484 (38.5) | 358 (27.9) | 1.38 (1.23–1.54) | <0.001 |
| Advanced neoplasm detection rate | 55 (4.4) | 51 (4.0) | 1.1 (0.76–1.6) | 0.69 |
| Sessile serrated lesion detection rate | 76 (6.0) | 58 (4.5) | 1.33 (0.96–1.86) | 0.09 |
| Polyp detection rate | 522 (41.5) | 381 (29.7) | 1.4 (1.25–1.55) | <0.001 |

¹ Using Fisher’s exact test.

Although there was no significant difference in the advanced neoplasia detection rate between the groups (4.1% vs. 4.7%; P = 0.39), the SSLDR was significantly higher in the CADe group (6.9% vs. 4.5%; P = 0.001), particularly among high detectors (7.8% vs. 4.5%; P = 0.001). The overall PDR was higher in the CADe group compared with the non-CADe group (47.9% vs. 36.6%; P < 0.001). In terms of polyp morphology, the CADe group showed improved detection rates for protruded (20.5% vs. 16.8%; P = 0.001) and flat lesions (35.7% vs. 26.8%; P < 0.001), whereas the detection of depressed lesions was similar between the groups (0.3% vs. 0.2%; P = 0.77). The number of polypectomies for non-neoplastic lesions per colonoscopy was higher in the CADe group than in the non-CADe group (0.13 [SD 0.43] vs. 0.08 [SD 0.34]; P < 0.001).


Detection rate trends

The trends in ADR and SSLDR over the 3-year study period are shown in Table 1s; Figs. 2s and 3s.

For high detectors, the ADR in the CADe group was significantly higher than in the non-CADe group in 2021 (44.0% vs. 32.0%; P < 0.001). This difference persisted in 2022, although it did not reach statistical significance (42.6% vs. 35.6%; P = 0.07). In 2023, the ADRs were comparable between the two groups (40.4% vs. 42.5%; P = 0.65). The SSLDR for high detectors was significantly higher in the CADe group in both 2021 (8.0% vs. 1.0%; P < 0.001) and 2022 (8.7% vs. 3.2%; P = 0.003), but the difference was not significant in 2023 (4.8% vs. 7.2%; P = 0.37).

For low detectors, the ADR in the CADe group was consistently and significantly higher than in the non-CADe group in 2021 (37.8% vs. 26.8%; P < 0.001) and 2022 (41.9% vs. 27.5%; P < 0.001); however, in 2023, the difference was not significant (28.4% vs. 29.1%; P = 0.93). The SSLDR for low detectors showed a similar trend, with significantly higher rates in the CADe group in 2022 (5.5% vs. 2.4%; P = 0.03), while the differences were not statistically significant in 2021 (5.9% vs. 3.9%; P = 0.21) and 2023 (8.2% vs. 6.2%; P = 0.39).


Per-polyp analysis

The per-lesion analysis is shown in Table 2s. Overall, 4083 lesions were detected (2375 in the CADe group, 1708 in the non-CADe group). The CADe group detected a higher proportion of small (1–5 mm) polyps (79.9% vs. 73.8%; P < 0.001), whereas the non-CADe group detected more medium (6–9 mm) and large (≥10 mm) lesions. The morphological distribution was similar between the groups, except for advanced CRC, which was more frequently detected in the non-CADe group (1.3% vs. 0.3%; P < 0.001). The distribution of lesion locations was comparable between the groups. Pathological analysis revealed similar rates of low grade tubular adenomas and SSLs, whereas the non-CADe group detected more high grade tubular adenomas (4.6% vs. 3.0%; P = 0.009) and invasive cancers (1.8% vs. 0.7%; P = 0.001).


Per-endoscopist analysis

The number and distribution of CADe and non-CADe procedures for each of the 53 endoscopists are summarized in Table 3s. The learning curves for each endoscopist are illustrated by individual CUSUM analyses stratified by detector status and modality, excluding endoscopists with extremely low procedure volumes, in Figs. 4s–7s.



Discussion

Our main finding was that, after CADe implementation, the ADR in procedures performed without CADe was maintained and did not decline, indicating no evidence of endoscopist deskilling. Furthermore, high detectors showed an accelerated learning curve, with sustained improvement in detection skills, while low detectors exhibited only a limited learning effect. To the best of our knowledge, this is the first study to demonstrate that CADe implementation maintained the ADR without a persistent decline in the non-CADe procedures.

This assessment of the long-term impact of AI-assisted procedures on the performance of endoscopists addresses critical gaps in the current literature on CADe in colonoscopy. Recent studies have primarily focused on comparing the ADR between CADe and non-CADe colonoscopies, demonstrating the immediate benefits of CADe use [5] [23] [24]; however, these RCTs were typically conducted over relatively short periods, usually around 1 year, and did not evaluate how the detection abilities of endoscopists evolved during the CADe implementation. Although the concept of AI-induced learning effects has been explored in other medical fields, such as radiology [25] and pathology [26], a quantitative assessment of the learning curves in CADe-assisted colonoscopy has not been previously reported. One particular concern is the potential for "deskilling," where the reliance on AI assistance might lead to the deterioration of independent detection skills. This concern stems from studies in other fields suggesting that AI use may lead to decreased decision-making skills and induce complacency among users [27]. Our study provides empirical evidence through CUSUM analysis that AI assistance can improve the learning curves of endoscopists without deskilling them in adenoma detection when AI is not in use.

Our study revealed that high detectors had a sustained improvement in adenoma detection even without CADe assistance, whereas low detectors showed minimal learning effects despite prolonged CADe exposure. The baseline ADR appears to be a crucial factor in predicting how well endoscopists can learn from AI assistance. In addition, we found that deskilling did not occur in either group, which addresses a major concern about AI implementation in clinical practice. The impact of AI on training and performance has been highlighted as a key research question in colonoscopy [28]. Although CADe can improve detection performance when in use, our results show that it can also contribute to sustained improvement in endoscopic skills, particularly among experienced endoscopists who maintained improved performance even without AI assistance. This suggests that CADe might have a dual role: immediate assistance during procedures and as a tool for long-term skill enhancement.

The CUSUM analysis revealed that high detectors demonstrated similar improvement patterns for their ADR and SSLDR during non-CADe procedures (Fig. 2s), suggesting the successful internalization of detection skills typically enhanced by AI assistance. SSL detection is traditionally challenging because of the subtle endoscopic appearance of SSLs (pale in color and flat, with indistinct margins and the presence of a mucus cap) [29], making the improvement in SSLDR particularly meaningful. The fact that the SSLDR among high detectors reached levels exceeding the 4.0% prevalence rate reported in recent large-scale studies [30] provides strong evidence of successful skill transfer from CADe-assisted to nonassisted procedures.

Although previous studies have focused on experience levels, suggesting that less experienced examiners benefited more from CADe [22] [31], our baseline ADR-based analysis revealed a different pattern. Although low detectors showed no deskilling in terms of adenoma detection, they also showed minimal learning effects despite prolonged CADe exposure. This finding suggests that the baseline ADR, rather than experience alone, may be a better predictor of skill transfer from CADe. A previous study reported that the ADR was higher among endoscopists who described themselves as more compulsive or thorough [32], and these personality traits may inherently exist in high detectors, potentially explaining their superior ability to adapt to and learn from CADe. Future research should focus on understanding why high detectors adapt more effectively to CADe and on developing targeted training programs to help low detectors achieve similar improvements.

A major strength of our study was its pragmatic design, which allowed the impact of CADe to be assessed in a real-world clinical setting. Although RCTs are considered the gold standard for evaluating interventions, they can be susceptible to the Hawthorne effect, where participants modify their behavior simply because they know they are being observed in a study setting [33]. This effect can be particularly pronounced in CADe trials, where endoscopists may perform more carefully than they do in routine practice if they know they are part of a trial [34]. In this study, the control group did not use a sham AI tool, unlike in a previous trial [3]. We tried to develop a sham AI tool, but we found it technically difficult, and endoscopists easily recognized the sham AI even when its use was blinded. This was particularly evident because our previous study showed that updates to the CADe system significantly reduced false positives [10], making the contrast with the frequent false positives of the sham AI even more apparent to endoscopists. Therefore, our pragmatic study design, which observed natural clinical practice over an extended period, provides valuable insights into the real-world learning effects and performance patterns associated with CADe implementation.

The present study had some limitations. First, this was a propensity score-matched, prospective, nonblinded, single-center study; therefore, we cannot eliminate potential selection bias. Second, because we are among the developers of this CADe system, “inventor bias” cannot be avoided. Future studies should therefore include a randomized multicenter design focusing on novice endoscopists, ideally led by investigators without conflicts of interest, to more accurately assess the impact of CADe and eliminate potential inventor bias.

Third, although our study demonstrated improved detection abilities with CADe implementation, we cannot completely rule out the possibility that the improvement was due to natural skill enhancement over time. Fourth, detection rate trends (Table 1s; Figs. 2s and 3s) were analyzed as a subanalysis without matching for each year, since matching was only performed for the overall cohort. Therefore, differences in patient backgrounds may have influenced the observed trends. Additionally, although there were no major changes in equipment or protocol, other unmeasured confounders may also have contributed to these findings. The per-polyp analysis may be subject to bias from clustering of lesions within patients.

Fifth, our study did not include standardized feedback or educational interventions during the learning period. In addition, although most endoscopists received a standardized CADe training session, no formal assessment of baseline knowledge or post-training skills was performed. The lack of both structured education and objective evaluation may be a confounder, especially for low detectors. Future studies should include educational components and objective assessments to better understand and optimize CADe-assisted learning.

In conclusion, after CADe implementation, detection rates in procedures performed without CADe were maintained and did not decline. While there was no evidence of endoscopist deskilling, improvement in detection skills was primarily observed among high detectors, with only a limited learning effect in the low detector group.



Conflict of Interest

S. Kudo and M. Misawa have received speaker’s fees from Olympus Corporation and have an ownership interest in the products of Cybernet Systems. T. Nemoto has received a research grant from Olympus Corporation for other studies. K. Mori has received research grants from Cybernet Systems, Morita Mfg, Atom Medical Inc., and Asahi Intecc, and serves as a technical advisor for Sanyu Industry. T. Okumura, Y. Ide, S. Kato, Y. Miyata, K. Takishima, Y. Takashina, Y. Minegishi, M. Abe, T. Sakurai, Y. Kouyama, K. Kato, Y. Maeda, Y. Ogawa, K. Ichimasa, N. Ogata, T. Hayashi, K. Wakamura, T. Baba, H. Miyachi, H. Itoh, and M. Oda declare that they have no conflicts of interest.

Acknowledgement

We wish to acknowledge the other members of the CADe colonoscopy working group: Yu Niimura, Keisuke Sasabe, Yurie Kawabata, Shunto Iwasaki, Tomoya Shibuya, Jiro Kawashima, Shigenori Semba, Takanori Kuroki, Osamu Shiina, Yuriko Morita, Kenichi Mochizuki, and Eri Tamura (all affiliated to the Digestive Disease Center, Showa Medical University Northern Yokohama Hospital, Kanagawa), and Eisuke Inoue (Showa Medical University Research Administration Center, Showa Medical University, Kanagawa, Japan). We are grateful to J. Ludovic Croxford, PhD, from Edanz (https://jp.edanz.com/ac) for editing a draft of this manuscript.


Correspondence

Masashi Misawa, MD, PhD
Digestive Disease Center, Showa Medical University Northern Yokohama Hospital
35-1 Chigasaki-chuo, Tsuzuki, Yokohama
224-8503 Kanagawa
Japan   

Publication History

Received: 08 April 2025

Accepted after revision: 18 July 2025

Accepted Manuscript online: 19 July 2025

Article published online: 05 September 2025

© 2025. Thieme. All rights reserved.

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany


Fig. 1 Study flowchart of patient inclusion and propensity score matching.
Fig. 2 Cumulative summation (CUSUM) analysis for high detectors based on the adenoma detection rate during computer-aided detection (CADe) and non-CADe procedures (with and without AI, respectively). Performance was judged as follows: if the CUSUM plot fell below the acceptable line (h0), the performance was acceptable; if the CUSUM plot was above the unacceptable line (h1), the performance was unacceptable.
Fig. 3 Cumulative summation (CUSUM) analysis for low detectors based on the adenoma detection rate during computer-aided detection (CADe) and non-CADe procedures (with and without AI, respectively). Performance was judged as follows: if the CUSUM plot fell below the acceptable line (h0), the performance was acceptable; if the CUSUM plot was above the unacceptable line (h1), the performance was unacceptable.
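
For readers unfamiliar with CUSUM charts, the decision rule described in the legends of Figs. 2 and 3 can be illustrated with a minimal sketch. It assumes the commonly used learning-curve formulation, in which each procedure without a detected adenoma counts as a "failure", the plot moves down by s after a success and up by 1 - s after a failure, and the decision lines h0 and h1 are derived from an acceptable failure rate (p0), an unacceptable failure rate (p1), and type I/II error rates (alpha, beta). The numerical values and the function names below are illustrative assumptions; they are not the parameters or the code used in this study.

```python
import math


def cusum_limits(p0, p1, alpha=0.1, beta=0.1):
    """Return (s, h0, h1) for a failure-based CUSUM chart.

    p0: acceptable failure rate, p1: unacceptable failure rate (p0 < p1).
    alpha, beta: type I and type II error rates (assumed values).
    """
    P = math.log(p1 / p0)
    Q = math.log((1 - p0) / (1 - p1))
    s = Q / (P + Q)                      # decrement applied after each success
    a = math.log((1 - beta) / alpha)
    b = math.log((1 - alpha) / beta)
    h0 = -b / (P + Q)                    # acceptable line (lower boundary)
    h1 = a / (P + Q)                     # unacceptable line (upper boundary)
    return s, h0, h1


def cusum_path(adenoma_detected, s):
    """Cumulative score after each procedure (True = adenoma detected)."""
    total = 0.0
    path = []
    for detected in adenoma_detected:
        total += -s if detected else (1 - s)
        path.append(total)
    return path


def judge(value, h0, h1):
    """Apply the rule from the figure legends to one CUSUM value."""
    if value < h0:
        return "acceptable"
    if value > h1:
        return "unacceptable"
    return "undetermined"


# Hypothetical targets: ADR of 40% is acceptable (p0 = 0.60 failure rate),
# ADR of 25% is unacceptable (p1 = 0.75 failure rate).
s, h0, h1 = cusum_limits(p0=0.60, p1=0.75)
procedures = [True, False, True, True, False, True, True, True]  # made-up data
path = cusum_path(procedures, s)
print(judge(path[-1], h0, h1))
```

Under this formulation, a run of procedures with detected adenomas pushes the plot downward toward h0, whereas a run of procedures without detection pushes it upward toward h1; values between the two lines leave the judgment undetermined.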