Gesundheitswesen 2017; 79(08/09): 656-804
DOI: 10.1055/s-0037-1605822
Talks
Georg Thieme Verlag KG Stuttgart · New York

Adjusting for multiplicity in diagnostic studies: Approaches for obtaining simultaneous confidence intervals for sensitivity and specificity

A Rudolph 1, A Zapf 2
1   QuintilesIMS, Real World Insights, Frankfurt am Main
2   Universitätsmedizin Göttingen, Institut für Medizinische Statistik, Göttingen

Publication History

Publication Date:
01 September 2017 (online)

 

Objective:

The simultaneous assessment of multiple procedures as potential diagnostic tests is common in early diagnostic trials. Whether a procedure is considered a useful diagnostic test is generally judged by its sensitivity and specificity. While assessing several procedures at once in a trial may be efficient, it also introduces a multiple testing problem. To maintain the desired type I error rate, the confidence intervals must be adjusted for this multiplicity. Therefore, the primary aim of this study was to investigate various methods for constructing simultaneous confidence intervals for sensitivity and specificity.

Methods:

As non-normal and ordinal data are common in diagnostic trials, nonparametric methods were considered. In particular, the following approaches were evaluated:

  • Bonferroni-corrected confidence intervals, with and without logit transformation;

  • Asymptotic nonparametric simultaneous confidence intervals for relative treatment effects, which correspond to multiple contrast tests;

  • Simultaneous confidence intervals based on multiple contrast tests using a logit transformation;

  • Wild-bootstrap-based simultaneous confidence intervals.
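
To illustrate the first approach in the list, the following is a minimal sketch of Bonferroni-corrected intervals with and without a logit transformation. It is not the authors' implementation: it assumes a binary test result, so that sensitivity and specificity are simple proportions, and it uses a Wald interval rather than the nonparametric estimators considered in the study. The function name `bonferroni_ci` and the parameter `m` (the number of intervals reported simultaneously) are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def bonferroni_ci(successes, n, m, alpha=0.05, logit=False):
    """Two-sided Wald interval for a proportion at Bonferroni level alpha / m.

    successes / n is the estimated sensitivity (or specificity).
    m is the number of simultaneous intervals (assumption of this sketch,
    e.g. 2 * number of procedures).
    """
    p = successes / n
    # Bonferroni-adjusted normal quantile: alpha is split over m intervals
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))
    if not logit:
        half = z * sqrt(p * (1 - p) / n)
        # Truncate to [0, 1]; the untransformed interval can overshoot
        return max(0.0, p - half), min(1.0, p + half)
    if successes == 0 or successes == n:
        raise ValueError("logit interval undefined for estimates of 0 or 1")
    # Delta method: se(logit(p)) = 1 / sqrt(n * p * (1 - p))
    eta = log(p / (1 - p))
    half = z / sqrt(n * p * (1 - p))
    # Back-transform the limits, which always lie strictly inside (0, 1)
    return 1 / (1 + exp(-(eta - half))), 1 / (1 + exp(-(eta + half)))


# Hypothetical counts: 45 of 50 diseased subjects detected, 6 intervals in total
print(bonferroni_ci(45, 50, m=6))
print(bonferroni_ci(45, 50, m=6, logit=True))
```

The logit version keeps both limits inside the unit interval, which matters most for estimates close to 0 or 1, where the untransformed interval has to be truncated.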

The empirical type I error rates of the different approaches were investigated in a simulation study, and all methods were also applied to real data.
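
How an empirical type I error rate can be estimated by simulation is sketched below. The abstract does not report the simulation settings, so everything here is an assumption made for illustration: binary test results, `m` independent endpoints that all have the true value `p0`, Bonferroni-adjusted Wald lower bounds, and a false positive defined as any lower bound exceeding `p0`. In real diagnostic data the endpoints are usually correlated, because all procedures are applied to the same subjects. The names `empirical_fwer` and `rejects` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejects(rng, p0, n, z):
    """Simulate one endpoint with true value p0; True if the lower bound exceeds p0."""
    x = sum(rng.random() < p0 for _ in range(n))
    p = x / n
    return p - z * sqrt(p * (1 - p) / n) > p0


def empirical_fwer(p0=0.8, n=100, m=4, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated trials with at least one false positive among m endpoints."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    # Bonferroni-adjusted quantile; the one-sided level per endpoint is alpha / (2 * m)
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))
    hits = sum(
        any(rejects(rng, p0, n, z) for _ in range(m))
        for _ in range(n_sim)
    )
    return hits / n_sim


# Compare the result with the nominal one-sided level alpha / 2
print(empirical_fwer())
```

A rate above the nominal level indicates liberal decisions and a rate below it indicates conservative decisions. A comparison of methods would replace the interval inside `rejects` and repeat the run over a grid of values for `p0`, `n` and `m`.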

Results:

While most of the methods listed above resulted in very liberal decisions, the confidence intervals based on multiple contrast tests using a logit transformation yielded empirical type I error rates closest to the desired level of 2.5%. However, this method led to conservative decisions for sensitivities and specificities larger than 90%.

Conclusion:

Further studies investigating alternative methods for constructing simultaneous confidence intervals for sensitivity and specificity, with potentially better properties, are worthwhile.