Evaluating Strategies for Marker Ranking in Genome-wide Association Studies of Complex Traits

A. Scherag; J. Hebebrand; H.-E. Wichmann; K.-H. Jöckel

doi:10.3414/ME09-02-0055

Methods Inf Med 2010; 49(06): 632-640
DOI: 10.3414/ME09-02-0055

Special Topic – Original Articles

Schattauer GmbH

Authors

A. Scherag
¹Institute for Medical Informatics, Biometry and Epidemiology, University Hospital of Essen, University Duisburg-Essen, Essen, Germany

J. Hebebrand
²Department of Child and Adolescent Psychiatry, University of Duisburg-Essen, Essen, Germany

H.-E. Wichmann
³Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Epidemiology, Neuherberg, Germany
⁴Ludwig-Maximilians University Munich, Institute of Medical Data Management, Biometrics and Epidemiology, Chair of Epidemiology, Munich, Germany

K.-H. Jöckel
¹Institute for Medical Informatics, Biometry and Epidemiology, University Hospital of Essen, University Duisburg-Essen, Essen, Germany

Further Information

Publication History

received: 09 December 2009

accepted: 24 February 2010

Publication Date: 18 January 2018 (online)

Summary

Background: Genome-wide association studies (GWAS) have been highly successful in identifying new susceptibility loci for complex traits. Such studies usually start by genotyping fixed arrays of genetic markers in an initial sample. From these markers, a subset is selected for further genotyping in independent samples. Because the a priori probability of a true positive association is very low, the vast majority of marker signals will turn out to be false positives. Several methods for ranking marker data have therefore been proposed; they are evaluated here.

Objectives: We compared statistical properties of ranking by p-values, q-values, the False Positive Report Probability (FPRP) and the Bayesian False-Discovery Probability (BFDP).
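As an illustration only, the four measures can be sketched for a single marker with a normally distributed test statistic z and a standard error se of the estimated log odds ratio. This is a minimal sketch under stated assumptions, not the implementation used in the study: the function and parameter names (`prior_alt`, `log_or_alt`, `prior_sd`) are hypothetical, the q-value is simplified to a Benjamini-Hochberg adjusted p-value (the null proportion is fixed at 1 rather than estimated), and the FPRP and BFDP formulas follow the forms commonly given in the literature for these measures.

```python
from math import exp, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - _N.cdf(abs(z)))


def bh_q_values(pvals):
    """Benjamini-Hochberg adjusted p-values (assumption: null proportion = 1)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


def fprp(z, se, prior_alt, log_or_alt):
    """FPRP = p(1 - pi) / (p(1 - pi) + power * pi).

    prior_alt (pi) is the assumed prior probability of a true association;
    log_or_alt is the assumed log odds ratio under the alternative.
    Power is evaluated at the observed p-value as significance level.
    """
    p = p_value(z)
    shift = abs(log_or_alt) / se
    power = _N.cdf(shift - abs(z)) + _N.cdf(-shift - abs(z))
    return p * (1 - prior_alt) / (p * (1 - prior_alt) + power * prior_alt)


def bfdp(z, se, prior_alt, prior_sd):
    """BFDP from an approximate Bayes factor (null versus alternative).

    prior_sd is the assumed standard deviation of a normal prior on the
    log odds ratio under the alternative.
    """
    v, w = se ** 2, prior_sd ** 2
    abf = sqrt((v + w) / v) * exp(-0.5 * z * z * w / (v + w))
    prior_odds_null = (1 - prior_alt) / prior_alt
    return abf * prior_odds_null / (1 + abf * prior_odds_null)
```

For a fixed standard error and fixed priors, all four measures are monotone in |z| over the range where the FPRP power term is not limiting, so differences in ranking arise mainly when standard errors or prior assumptions vary across markers.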

Methods: We performed simulation studies for a genomic region derived from GWAS data sets and calculated descriptive statistics as well as mean square errors with regard to the true marker ranking. Additionally, we applied all measures to a GWAS for early onset extreme obesity superimposing a priori information on candidate genes.
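The summary does not state how the mean square error with regard to the true ranking was computed. One plausible reading, shown here purely as an assumption, is the mean squared difference between each marker's rank under a given measure and its rank in the known simulation truth. The helper names `ranks` and `rank_mse` are hypothetical.

```python
def ranks(values):
    """Rank values in ascending order (1 = smallest).

    Ties are broken by input order; this is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def rank_mse(estimated_ranks, true_ranks):
    """Mean squared difference between estimated and true marker ranks."""
    n = len(true_ranks)
    return sum((e - t) ** 2 for e, t in zip(estimated_ranks, true_ranks)) / n
```

Under this reading, a measure that reproduces the true order exactly scores 0, and larger values indicate that markers were placed further from their true positions.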

Results: Despite the known, more extreme probability results for traditional p-values, we observed that both p-values and the BFDP were more precise in reconstructing the “true” order of the markers in a region. In addition, the BFDP was useful to attenuate unexpected effects at a genome-wide scale.

Conclusions: For the purpose of selecting markers from an initial GWAS and within the limits of this study, we recommend either ranking by p-values or the application of a full Bayesian approach for which the BFDP is a first approximation.

Keywords

Genome-wide association study - p-value - q-value - FPRP - BFDP
