Methods Inf Med 2007; 46(05): 538-541
DOI: 10.1160/ME0397
Paper
Schattauer GmbH

Pseudo-precision in Gene Expression Values Can Reduce Efficiency

M. Neuhäuser, T. Boes, K.-H. Jöckel
Institute for Medical Informatics, Biometry and Epidemiology, University Hospital Essen, Essen, Germany

Publication History

Publication Date:
22 January 2018 (online)

Summary

Objectives: When gene expression is estimated from scanned microarray images, the algorithms applied in the so-called low-level analysis can compute expression values with an arbitrary number of decimal places. Most of these digits are not justified, because they do not reflect the precision of the measured expression. The result is pseudo-precision and, as a consequence, an absence of tied values.

Methods: We suggest avoiding or removing the pseudo-precision: ties can be left in place, or created by rounding the computed expression values. Nonparametric tests can then be applied in the presence of ties by using average ranks. We illustrate this with two real data sets and the Wilcoxon rank sum test.
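As a rough illustration of the procedure described in the Methods, the following Python sketch rounds expression values, assigns average ranks (midranks) to ties, and computes a two-sided exact permutation p-value for the Wilcoxon rank sum statistic. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `midranks` and `rank_sum_exact`, the `decimals` parameter, and the toy expression values are all hypothetical, and the exhaustive enumeration is only practical for small groups.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_exact(x, y, decimals=None):
    """Rank sum of x (on midranks) and its two-sided exact permutation p-value.

    If `decimals` is given, both samples are rounded first, which may create ties.
    """
    if decimals is not None:
        x = [round(v, decimals) for v in x]
        y = [round(v, decimals) for v in y]
    ranks = midranks(list(x) + list(y))
    n, m = len(ranks), len(x)
    expected = m * (n + 1) / 2  # mean of the rank sum under the null hypothesis
    statistic = sum(ranks[:m])
    observed = abs(statistic - expected)
    count = total = 0
    # Enumerate every way of assigning m of the n ranks to the first group.
    for idx in combinations(range(n), m):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-9:
            count += 1
    return statistic, count / total


# Hypothetical expression values for one gene in two groups (invented numbers).
group_a = [7.4312, 7.3899, 8.1204]
group_b = [7.4101, 6.9577, 7.0123]
stat, p = rank_sum_exact(group_a, group_b, decimals=1)
```

The toy values only show the mechanics of rounding and midranks; they are not meant to reproduce the efficiency comparison reported for the two real data sets.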

Results: We demonstrate that rounding yields a more efficient test, i.e. the average p-value decreases and the number of p-values smaller than 0.05 increases.

Conclusions: The random noise introduced by pseudo-precision can reduce the efficiency of statistical tests used to detect differentially expressed genes. This result is clearly relevant to many other areas of our digitized world.

 