Summary
Objectives:
One of the main objectives of microarray analysis is to identify genes differentially
expressed under two distinct experimental conditions. This task is complicated by
the noisiness of the data and the large number of genes examined. Fold change
(FC) based gene selection is often misleading because the error variability of each
gene differs across intensity ranges. Several statistical methods have been
suggested, but some of them result in high false positive rates because they make
very strong parametric assumptions.
Methods:
We present support vector machine quantile regression (SVMQR) using an iterative
reweighted least squares (IRWLS) procedure based on the Newton method instead of the
usual quadratic programming algorithms. This procedure makes it possible to derive the
generalized approximate cross validation (GACV) method for choosing the parameters that
affect the performance of SVMQR. We propose a novel SVMQR-based method for identifying
differentially expressed genes with a small number of replicated microarrays.
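As an illustration of the IRWLS idea only, the following minimal sketch fits a plain linear quantile regression by repeatedly solving weighted least squares problems in which the check (pinball) loss is approximated by a weighted squared loss. It is not the SVMQR procedure itself: the linear model (no kernel), the absence of GACV-based parameter selection, the clamping constant eps, the simulated log-ratio data, and the function names pinball_loss and irwls_quantile_regression are all assumptions made for this example.

```python
import numpy as np


def pinball_loss(residuals, tau):
    """Check (pinball) loss summed over all residuals for quantile level tau."""
    return np.sum(residuals * (tau - (residuals < 0)))


def irwls_quantile_regression(x, y, tau=0.5, n_iter=100, eps=1e-6, tol=1e-8):
    """Linear quantile regression via iteratively reweighted least squares.

    Hypothetical helper: each step replaces the check loss by w_i * r_i**2,
    with w_i = |tau - 1(r_i < 0)| / |r_i|, and solves the normal equations.
    """
    X = np.column_stack([np.ones(len(y)), x])      # add an intercept column
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # ordinary least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        # Clamp |r| from below so that near-zero residuals do not give huge weights.
        w = np.where(r >= 0, tau, 1.0 - tau) / np.maximum(np.abs(r), eps)
        XtW = X.T * w                              # X^T W without forming W
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated example (assumed data): log-ratios whose spread shrinks with intensity.
rng = np.random.default_rng(0)
intensity = rng.uniform(6.0, 16.0, size=2000)
log_ratio = rng.normal(0.0, 0.2 + 2.0 * np.exp(-0.5 * (intensity - 6.0)))

lower = irwls_quantile_regression(intensity, log_ratio, tau=0.05)
upper = irwls_quantile_regression(intensity, log_ratio, tau=0.95)

# Genes outside the fitted quantile band are candidates for differential expression.
X_demo = np.column_stack([np.ones_like(intensity), intensity])
outside = (log_ratio < X_demo @ lower) | (log_ratio > X_demo @ upper)
print("fraction flagged:", outside.mean())
```

In this sketch, genes are flagged relative to quantile lines that depend on intensity, so the threshold is not a single fold-change cutoff. A kernel-based fit, as in SVMQR, would let the band follow nonlinear trends in the variability.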
Results:
We applied SVMQR to three biological datasets and a simulated dataset and showed
that it performed more reliably and consistently than FC-based gene selection, Newton’s
method based on the posterior odds of change, or the nonparametric t-test variant
implemented in significance analysis of microarrays (SAM).
Conclusions:
The SVMQR method is an exploratory method for cDNA microarray experiments that identifies
genes with different expression levels between two types of samples (e.g., tumor versus
normal tissue). It performed well when the error variability of each gene was
heterogeneous across intensity ranges.
Keywords
cDNA microarray - support vector machine - support vector machine quantile regression