Adjusting for multiplicity in diagnostic studies: Approaches for obtaining simultaneous confidence intervals for sensitivity and specificity

A Rudolph; A Zapf

doi:10.1055/s-0037-1605822


Gesundheitswesen 2017; 79(08/09): 656-804

Oral Presentations

Georg Thieme Verlag KG Stuttgart · New York


A Rudolph¹, A Zapf²

¹ QuintilesIMS, Real World Insights, Frankfurt am Main
² Universitätsmedizin Göttingen, Institut für Medizinische Statistik, Göttingen


Publication Date: 01 September 2017 (online)


Objective:

Early diagnostic trials commonly assess multiple procedures simultaneously as potential diagnostic tests. Whether a procedure is considered a useful diagnostic test is generally judged by its sensitivity and specificity. While assessing several procedures in a single trial may be efficient, it also introduces multiple testing. To maintain the desired type I error rate, the confidence intervals must be adjusted for this multiplicity. The primary aim of this study was therefore to investigate various methods for constructing simultaneous confidence intervals for sensitivity and specificity.

Methods:

As non-normal and ordinal data are common in diagnostic trials, nonparametric methods were considered. In particular, the following approaches were evaluated:

Bonferroni-corrected confidence intervals, with and without logit transformation;
Asymptotic nonparametric simultaneous confidence intervals for relative treatment effects, which correspond to multiple contrast tests;
Simultaneous confidence intervals based on multiple contrast tests using a logit transformation;
Wild-bootstrap-based simultaneous confidence intervals.
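
To make the first approach in the list concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary test results and a simple Wald interval for a proportion, whereas the abstract works with nonparametric estimators. The function name `bonferroni_ci`, its parameters, and the counts in the usage lines are hypothetical and chosen only for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def bonferroni_ci(successes, n, n_intervals, alpha=0.05, logit=False):
    """Two-sided Wald interval for a proportion (e.g. a sensitivity),
    computed at the Bonferroni-adjusted level alpha / n_intervals.

    Assumption: binary outcomes and a normal approximation; this is a
    simplified stand-in for the nonparametric intervals in the abstract.
    """
    if not 0 < successes < n:
        raise ValueError("this Wald sketch needs 0 < successes < n")
    p = successes / n
    # Bonferroni: split alpha equally over all simultaneous intervals.
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_intervals))
    if not logit:
        half = z * sqrt(p * (1 - p) / n)
        # Truncate to [0, 1], since the plain Wald interval can overshoot.
        return max(0.0, p - half), min(1.0, p + half)
    # Logit scale: by the delta method, se(logit(p)) = 1 / sqrt(n * p * (1 - p)).
    eta = log(p / (1 - p))
    half = z / sqrt(n * p * (1 - p))
    # Back-transform the limits with the inverse logit.
    return 1 / (1 + exp(-(eta - half))), 1 / (1 + exp(-(eta + half)))


# Hypothetical example: 2 procedures, each with a sensitivity and a
# specificity, give 4 simultaneous intervals. Counts are made up.
plain = bonferroni_ci(45, 50, n_intervals=4)
transformed = bonferroni_ci(45, 50, n_intervals=4, logit=True)
```

The logit version keeps both limits strictly inside (0, 1) without truncation, which is the usual motivation for transforming when the estimate lies close to 1.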

The empirical type I error rates of the different approaches were investigated in a simulation study, and all methods were also applied to real data.
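
The idea of estimating an empirical type I error rate by simulation can be sketched as follows. This is an illustration under strong assumptions and not the simulation design of the study, which the abstract does not describe: results are binary and independent across procedures, every true sensitivity sits exactly on the null boundary `p0`, and plain Bonferroni-adjusted Wald intervals are used. The function name `empirical_fwer` and all default values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def empirical_fwer(p0=0.8, n=100, m=3, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated trials in which at least one of m lower
    confidence limits exceeds p0, although every true value equals p0.

    Assumptions: independent binary outcomes and Bonferroni-adjusted
    two-sided Wald intervals; all defaults are illustrative only.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))  # Bonferroni-adjusted quantile
    false_positives = 0
    for _ in range(n_sim):
        for _ in range(m):
            # Number of correctly classified subjects for one procedure.
            x = sum(rng.random() < p0 for _ in range(n))
            p = x / n
            lower = p - z * sqrt(p * (1 - p) / n)
            if lower > p0:
                # One false rejection is enough for a familywise error.
                false_positives += 1
                break
    return false_positives / n_sim


rate = empirical_fwer()
```

Under these assumptions, `rate` would be compared with the nominal one-sided level `alpha / 2`: values clearly above it indicate liberal decisions, values clearly below it conservative ones.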

Results:

While most of the methods listed above led to very liberal decisions, the confidence intervals based on multiple contrast tests with a logit transformation yielded empirical type I error rates closest to the desired level of 2.5%. However, this method led to conservative decisions for sensitivities and specificities larger than 90%.

Conclusion:

It is worthwhile to conduct further studies investigating additional alternative methods to construct simultaneous confidence intervals for sensitivity and specificity with potentially better properties.