Methods Inf Med 2012; 51(04): 341-347
DOI: 10.3414/ME11-02-0045
Focus Theme – Original Articles
Schattauer GmbH

Supporting Regenerative Medicine by Integrative Dimensionality Reduction

F. Mulas (1), L. Zagar (2), B. Zupan (2, 1, 3), R. Bellazzi (4, 1)

1   Centre for Tissue Engineering, University of Pavia, Pavia, Italy
2   Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
3   Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
4   Dipartimento di Ingegneria Industriale e dell’Informazione, Università di Pavia, Pavia, Italy

Publication History

Received: 08 November 2011
Accepted: 04 May 2012
Publication Date: 20 January 2018 (online)

Summary

Objective: Assessing the developmental potential of stem cells is a crucial step towards their clinical application in regenerative medicine. Genome-wide expression profiles have been shown to predict the cellular differentiation stage by means of dimensionality reduction methods. Here we show that these techniques can be further strengthened to support decision making with (i) a novel strategy for gene selection and (ii) methods for combining the evidence from multiple data sets.

Methods: We propose to exploit dimensionality reduction methods for the selection of genes specifically activated in different stages of differentiation. To obtain an integrated predictive model, the expression values of the selected genes from multiple data sets are combined. We investigated distinct approaches that either aggregate data sets or use learning ensembles.
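The two steps described above (gene selection through dimensionality reduction, then integration by either aggregating data sets or using an ensemble) can be illustrated with a minimal sketch. The summary does not name the specific reduction method or predictor, so everything here is an assumption made for illustration: PCA computed via SVD, ranking genes by absolute loading, and a linear fit from the first principal-component score to the stage. All function names are hypothetical and do not come from the article.

```python
import numpy as np

def top_loading_genes(X, n_genes, n_components=1):
    """Rank genes (columns of X, samples x genes) by absolute PCA loading.

    Assumption: PCA stands in for the unspecified reduction method.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, i.e. per-gene loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = np.abs(Vt[:n_components]).max(axis=0)
    return np.argsort(scores)[::-1][:n_genes]

def fit_stage_model(X, stage):
    """Fit a line from the first principal-component score to the stage."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    axis = Vt[0]
    z = (X - mean) @ axis
    # The fitted slope absorbs the arbitrary sign of the principal axis.
    slope, intercept = np.polyfit(z, stage, 1)
    return mean, axis, slope, intercept

def predict_stage(model, X):
    mean, axis, slope, intercept = model
    return slope * ((X - mean) @ axis) + intercept

def aggregate_predict(datasets, X_new):
    """Integration by aggregation: pool all samples, fit one model."""
    X = np.vstack([X_i for X_i, _ in datasets])
    y = np.concatenate([y_i for _, y_i in datasets])
    return predict_stage(fit_stage_model(X, y), X_new)

def ensemble_predict(datasets, X_new):
    """Integration by ensemble: one model per data set, average predictions."""
    preds = [predict_stage(fit_stage_model(X_i, y_i), X_new)
             for X_i, y_i in datasets]
    return np.mean(preds, axis=0)
```

In this sketch each data set is a pair (expression matrix, stage vector) restricted to the same selected genes. Pooling assumes the data sets are measured on comparable scales, whereas the ensemble variant only requires each data set to be internally consistent.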

Results: We analyzed the performance of the proposed methods on six publicly available data sets. The selection procedure identified a reduced subset of genes whose expression values yielded accurate stage predictions. The assessment of predictive accuracy demonstrated high-quality predictions for most of the data integration methods presented.

Conclusion: The experimental results highlighted the main strengths of the proposed approaches. These include the ability to predict the true staging by combining multiple training data sets when it could not be inferred from a single data source, and the ability to focus the analysis on a reduced list of genes with similar predictive performance.