Endoscopy 2022; 54(10): 1015-1016
DOI: 10.1055/a-1819-6568
Editorial

Targeting the low detector with artificial intelligence

Referring to Troya J et al. p. 1009–1014
Cesare Hassan 1,2, Alessandro Repici 1,2
1   IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
2   Department of Biomedical Sciences, Humanitas University, Rozzano, Milan, Italy

Colorectal neoplasia was not meant to be detected by human endoscopists. Disappointingly, one in every three polyps is missed even by an expert endoscopist. Assuming an average adenoma detection rate (ADR) of 30 %, this indicates that for every patient correctly diagnosed there is one polyp that has been missed. This is further worsened by inter-endoscopist variability, exemplified by a threefold difference between high and low detectors, suggesting that low detectors can miss more than two polyps out of three, which is clearly unacceptable from clinical, technical, and ethical standpoints.
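
The step from a threefold difference to "more than two polyps out of three" can be made explicit with a back-of-the-envelope calculation. The sketch below rests on a simplifying assumption that is ours, not a measured result: that the threefold difference between high and low detectors translates proportionally into per-polyp detection, with the expert miss rate of one in three taken as the reference. The function name is hypothetical.

```python
def miss_rate_low_detector(expert_miss_rate: float, fold_difference: float) -> float:
    """Estimate the per-polyp miss rate of a low detector.

    Assumption (illustrative only): the low detector finds
    1/fold_difference of the polyps that an expert would find.
    """
    expert_detected = 1.0 - expert_miss_rate          # share of polyps an expert finds
    low_detected = expert_detected / fold_difference  # share a low detector finds
    return 1.0 - low_detected


# Expert misses 1 in 3; threefold gap between high and low detectors.
rate = miss_rate_low_detector(1 / 3, 3.0)
print(f"{rate:.2f}")  # 0.78, i.e. 7 of 9 polyps missed
```

Under these assumptions the low detector misses seven polyps out of nine, which is indeed more than two out of three.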

Unexpectedly, only a few studies have addressed the technical shortcomings of low detectors. There are two likely reasons. First, most institutional review boards would refuse any study in which patients were to be intentionally scoped by endoscopists with suboptimal skill. Second, the Hawthorne effect would distort the motivation of the low detectors, artificially inflating their performance in a controlled setting. This lack of knowledge is detrimental, preventing any reliable intervention to improve the ADR of low detectors.

There are two main factors that explain why any endoscopist may miss a polyp: a blind spot or a perception error. Both are somewhat expected due to the angulation and folding of colorectal mucosa, and the subtle appearance of colorectal neoplasia, respectively. So, which factor drives the miss rate of low detectors? Is it inadequate exposure of the mucosa behind a fold or the failure to spot a lesion that is on the screen? This is critical, as the two categories of errors require different interventions.

Artificial intelligence (AI) technologies have the advantage of operating independently of the human mind. There is no reason why AI should positively or negatively affect the assessment of human performance, and it can therefore serve as an objective referee for each individual endoscopist. In contrast to the human mind, AI is restricted to very narrow tasks, such as the detection of polyps or the rating of the level of bowel preparation or withdrawal speed. Can we use AI to shed light on the specific deficiencies of low detectors? This would help us to better exploit AI to improve the performance of low detectors.

Two studies in this issue of Endoscopy assessed the techniques of individual endoscopists by using AI as an objective benchmark [1,2]. In particular, the Liu et al. study addressed the issue of mucosal exploration, and the Troya et al. study investigated lesion spotting. Both studies explored whether AI could be used to remedy the detection failures of low detectors, which would strengthen the case for implementing AI in clinical practice.

By using AI, Liu et al. objectively measured the quality of fold examination of different endoscopists categorized according to their baseline ADR [1]. Briefly, the AI system measured the proportion of the examination in which the endoscopist achieved a wall view (i.e. exposure of colorectal mucosa to the lens of the scope) compared with a lumen view, which was considered clinically useless for the detection of colorectal neoplasia. The AI system was validated against an expert-based rating of the quality of fold assessment, providing more credibility to its clinical relevance. By categorizing the 11 study endoscopists into low and high detectors (ADR cutoff 25 %), the authors showed a significantly worse quality of fold examination in low detectors compared with high detectors that was, in turn, also associated with a decreased withdrawal time. Two main inferences about low detectors can be drawn from these findings. First, suboptimal coverage of the colorectal mucosa drives down the performance of the low detector, as an unacceptable proportion of the inspection time is uselessly spent visualizing the lumen of the colon rather than the surrounding mucosal surface. This is in line with a recent paper correlating the movement of the colonoscope tip, as assessed by a scope-guide system, with the ADR [3]. Second, the association between withdrawal time and ADR can be explained, at least partially, by the fact that a shorter withdrawal time comes at the expense of adequate visualization of colorectal mucosa rather than from a faster but equally thorough exploration. In other words, you see less because you scope worse.
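
To make the wall-view measure concrete, the following is a purely illustrative sketch and not the system used by Liu et al., whose internal workings are not described here. It assumes that each withdrawal frame has already been assigned one of the hypothetical labels "wall", "lumen", or "other" by some upstream classifier, and it simply reports the share of scored frames that show a wall view.

```python
from collections import Counter
from typing import Iterable


def wall_view_fraction(frame_labels: Iterable[str]) -> float:
    """Share of wall-view frames among frames labeled "wall" or "lumen".

    The label names are hypothetical; frames with any other label
    (e.g. blurred or washing frames) are ignored.
    """
    counts = Counter(frame_labels)
    scored = counts["wall"] + counts["lumen"]
    if scored == 0:
        raise ValueError("no wall or lumen frames to score")
    return counts["wall"] / scored


# Toy example: three wall-view frames, two lumen-view frames, one unscored frame.
labels = ["wall", "wall", "lumen", "wall", "lumen", "other"]
print(wall_view_fraction(labels))  # 0.6
```

A per-endoscopist value of this kind could then be compared between low and high detectors, which is the comparison the study reports.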

The circle of AI for quality of fold examination was successfully closed by Liu et al., who demonstrated that the assistance of the AI system enhanced the performance of low detectors but not of high detectors [1]. This underlines how AI can only improve what was defective in the first place, while its relevance for high detectors is likely to be marginal, if present at all.

In the second paper, Troya et al. tracked eye movements during polyp detection, analyzing their interaction with a computer-aided polyp detection system (CADe) [2]. Novice endoscopists, who may approximate low detectors, showed a trend toward a slower reaction time in polyp detection, by nearly 1 second, compared with experienced endoscopists. In addition, novice endoscopists presented a less focused gaze pattern compared with experienced endoscopists, explaining their slower reaction time. Thus, we can infer that low detectors may miss a lesion that is present on only one or a few frames due to a slower reaction time in spotting the lesion, which is in turn related to suboptimal training in eye movement due to lack of expertise or other reasons.

In contrast to the study by Liu et al., the analysis of a possible benefit of AI on such shortcomings of novice endoscopists has been much more disappointing. First, the reaction time of any endoscopist, irrespective of experience or performance, was unaffected by the use of CADe, which, used alone, was much faster than any human endoscopist. Thus, a possible training effect of CADe on novice or low detectors is unlikely to occur. In fact, a deskilling effect is much more likely because any endoscopist, and especially the low detectors, will be repeatedly frustrated by the fact that the CADe is persistently faster than the human eye, with a psychological impact on endoscopist attention in the long term. This was disappointingly confirmed by the reduction of eye travel distance, potentially indicating that the endoscopist was passively waiting for the next CADe box to appear. This overreliance was also confirmed by a higher rate of misinterpretation of false-positive alerts as true polyps when the endoscopist was assisted by CADe compared with unassisted colonoscopy.

Taken together, the two studies identify two main deficiencies of the low detector, namely suboptimal coverage of the colorectal mucosa and slower detection of visible lesions due to an inadequately trained gaze pattern. Sadly, the separate outcomes of the two studies are likely to combine in real life, suggesting that low detectors are less likely to spot the fewer polyps that are detectable in incompletely exposed mucosa during a possibly shorter and, in any case, less effective withdrawal time. Unfortunately, AI cannot be considered as the definitive answer to low detectors as it may further worsen the lack of visual attention, replacing rather than improving the level of performance.

AI may have a role in quality assurance by objectively assessing the skills of any endoscopist before or irrespective of ADR measurement. This should be recommended when certifying endoscopists for participation in population-based screening interventions, leading to better standardization of endoscopist performance by early rather than delayed identification of those endoscopists who are at risk of being low detectors.



Publication History

Article published online:
20 May 2022

© 2022. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany