Endoscopy 2023; 55(09): 820-821
Colonoscopy performance measures: going all in?

Referring to Nass KJ et al. p. 812–819
1   Department of Gastroenterology, University Hospital of North Tees, Stockton-on-Tees, UK
2   Population Health Sciences Institute, Newcastle University, Newcastle-upon-Tyne, UK
Delivering high quality care is essential for improving population health outcomes. This is particularly true for colonoscopy, where evidence shows that high quality procedures result in fewer complications and lower incidence and mortality rates for post-colonoscopy colorectal cancer [1].

Measuring quality is an important step in identifying and addressing underperformance to improve health outcomes. Evidence-based performance measures, or quality indicators, help healthcare providers monitor key steps and outcomes to ensure they are delivering high quality, patient-centered care. Effective performance measures should each correlate with an important health outcome. Ideally, there should be a small number of measures, each assessing an important aspect of the service (domain) [2].

The endoscopy community might at times feel that it is overwhelmed with performance measures. Not only can measurement be time-consuming, but having too many measures is also counterproductive, as it can diminish the impact of each one. The European Society of Gastrointestinal Endoscopy (ESGE) attempted to address this by developing a small number of colonoscopy performance measures [3] but, in this issue of Endoscopy, Nass et al. go one step further in proposing a single composite performance measure, which they term “textbook process” [4]. This is an inherently attractive idea, but is this progress or just another measure to add into the mix?

In brief, the authors convened a group of 27 European expert endoscopists, who used a Delphi process to select underpinning measures based on the ESGE Guideline on performance measures for lower GI endoscopy; from these, an “all or none” composite measure was created. The underpinning measures were: explicit colonoscopy indication; successful cecal intubation; adequate bowel preparation; adequate withdrawal time; acceptable patient comfort score; post-polypectomy surveillance recommendation in line with current guidelines; and the absence of each of the following: use of reversal agents, early complications, adverse events, all-cause readmission within 14 days after the procedure, and all-cause mortality within 30 days after the procedure. Whilst this is not the only composite performance measure for colonoscopy [5], it is the first to have been constructed by international consensus, aiming to cover most colonoscopy domains. Of note, because they constructed this metric at an individual colonoscopy level, they could not include a detection measure such as the mean number of polyps per procedure – more on this later.
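
For readers who think in code, the “all or none” logic can be sketched in a few lines of Python. This is purely illustrative and not the authors' implementation: the field names, the `failed_measures` helper, and the `textbook_process` function are hypothetical, and the thresholds that decide whether a given measure counts as “adequate” or “acceptable” are assumed to have been applied upstream. How cases where a measure does not apply are scored (for example, a surveillance recommendation when no polypectomy was performed) is likewise left as an assumption.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class ColonoscopyRecord:
    """One procedure; each field is True when that underpinning measure was met.

    Field names are hypothetical labels for the measures listed in the text.
    """
    explicit_indication: bool
    cecal_intubation: bool
    adequate_bowel_preparation: bool
    adequate_withdrawal_time: bool
    acceptable_comfort_score: bool
    surveillance_per_guideline: bool
    no_reversal_agents: bool
    no_early_complications: bool
    no_adverse_events: bool
    no_readmission_14_days: bool
    no_mortality_30_days: bool


def failed_measures(record):
    """Return the names of the underpinning measures that were not met."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def textbook_process(record):
    """All-or-none: achieved only if every underpinning measure was met."""
    return not failed_measures(record)


# A procedure meeting every measure, and one that misses a single measure.
all_met = ColonoscopyRecord(**{f.name: True for f in fields(ColonoscopyRecord)})
short_withdrawal = replace(all_met, adequate_withdrawal_time=False)

print(textbook_process(all_met))          # True
print(textbook_process(short_withdrawal))  # False
print(failed_measures(short_withdrawal))   # ['adequate_withdrawal_time']
```

Keeping the list of failed measures alongside the pass/fail result is a deliberate choice in this sketch: a composite that fails still has to be broken down into its components before anyone can act on it.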

The authors then tested their measure retrospectively on data from two Dutch endoscopy services. Overall, their measure was achieved in 72.5 % of colonoscopies, ranging from 41.0 % to 89.1 % at an individual colonoscopist level of measurement. Acceptable patient comfort and an adequate withdrawal time were the underpinning measures that were least commonly achieved.
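
The kind of summary reported here (an overall rate, a rate per endoscopist, and the measures most often missed) could be derived along the following lines. This is a minimal sketch on invented toy data, not the study's data or method; the `summarize` function, the endoscopist labels, and the three measure names are hypothetical, and only a subset of the underpinning measures is shown for brevity.

```python
from collections import Counter, defaultdict

# Toy data, invented for illustration only (not the study's data).
# Each row: (endoscopist id, {underpinning measure: met?}).
procedures = [
    ("A", {"cecal_intubation": True, "withdrawal_time": True, "comfort": True}),
    ("A", {"cecal_intubation": True, "withdrawal_time": False, "comfort": True}),
    ("B", {"cecal_intubation": True, "withdrawal_time": True, "comfort": True}),
    ("B", {"cecal_intubation": True, "withdrawal_time": False, "comfort": False}),
    ("B", {"cecal_intubation": True, "withdrawal_time": True, "comfort": True}),
    ("B", {"cecal_intubation": True, "withdrawal_time": True, "comfort": True}),
]


def summarize(rows):
    """Return (overall rate, rate per endoscopist, failure count per measure)."""
    achieved = defaultdict(int)
    total = defaultdict(int)
    failures = Counter()
    for endoscopist, measures in rows:
        total[endoscopist] += 1
        # All-or-none: the procedure counts only if every measure was met.
        if all(measures.values()):
            achieved[endoscopist] += 1
        failures.update(name for name, met in measures.items() if not met)
    per_endoscopist = {e: achieved[e] / total[e] for e in total}
    overall = sum(achieved.values()) / sum(total.values())
    return overall, per_endoscopist, failures


overall, per_endoscopist, failures = summarize(procedures)
print(per_endoscopist)          # {'A': 0.5, 'B': 0.75}
print(failures.most_common(1))  # [('withdrawal_time', 2)]
```

In this toy example, four of six procedures achieve the composite, and the per-measure failure counts show where the shortfall lies, which the headline rate alone does not.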

It is undoubtedly attractive to have a single measure: if a service, or endoscopist, reaches the agreed standard, they can be “green-lighted” and there may be no need to look further, simplifying the quality assurance process. But where the standard is not reached, the measure provides no context and must immediately be deconstructed to identify the underlying issue.

To maximize the potential of this composite measure, it should work on a traffic light basis, rather than being reported as a percentage. But do we know what good looks like? Further work would be required to determine the range of results for a larger number of endoscopists and services, from which a cutoff could be determined, balancing the incentivizing/demotivating potential of setting that bar too low or high.

The authors argue that a single measure might create a more holistic approach to colonoscopy quality and I suspect they are right, as endoscopists would know that missing out on even one of the underpinning measures would count against them. Conversely, it also ties the endoscopist to the weakest link in the chain and, if this is to be measured for individual endoscopists, as opposed to for the service as a whole, it might be considered unfair as some of the underpinning measures, such as bowel preparation, are beyond their control.

The authors chose a methodology that permits each colonoscopy to be awarded a red or green light, although this can only be done once the histology has been reported and any adverse events have presented and been reported in a published dataset. They do not, however, advocate reporting this new measure at this level, and I would agree. Given that they recommend its use at either an endoscopist or a service level, I feel they have missed an opportunity to correct the biggest deficit of their composite measure – namely, the omission of a polyp detection measure, the most important aspect of colonoscopy performance. Unfortunately, this means that their composite measure cannot be used alone. Whilst this substantially undermines the benefit of their proposed measure, the omission would be easy to resolve.

Finally, on to feasibility: could this measure be adopted by most endoscopy services, so becoming the primary benchmarking tool? Sadly, at present I think the answer to that is “no.” This particular measure required linkage with datasets that robustly capture post-procedure complications, hospital admissions, and all-cause mortality. It also required correlation with histology reports to determine the correct post-polypectomy surveillance recommendations – indeed only one of their two test sites could do this. Hence, if it is a matter of “all or none” for benchmarking, only services that can capture all of the data for each underpinning performance measure can be compared. Here, the advantage goes to noncomposite performance measures, as at least some of these can be compared between services. The authors do acknowledge that improvements would be required in structured and standardized reporting.

What is my conclusion? The authors are to be applauded for attempting to create a unifying colonoscopy performance measure. Reading the paper has changed my mind about the potential benefit of a single headline composite performance measure, particularly where data are held electronically. But I do not think this is the right one, in particular because it omits a polyp detection measure. Perfection is an iterative process, and we now have a platform on which we can develop a better composite measure. Couple this with automated capture and calculation of performance measures, as is already available in certain jurisdictions [6], and we might yet streamline colonoscopy quality assurance.

