Methods Inf Med 2019; 58(06): 213-221
DOI: 10.1055/s-0040-1702159
Original Article
Georg Thieme Verlag KG Stuttgart · New York

Analysis of Feature Extraction Methods for Prediction of 30-Day Hospital Readmissions

Joel Sumner 1, Adel Alaeddini 1

1   Department of Mechanical Engineering, The University of Texas at San Antonio, San Antonio, Texas, United States

Publication History

Received: 01 May 2019
Accepted: 31 December 2019
Publication Date: 29 April 2020 (online)

Abstract

Objectives This article aims to determine whether feature extraction methods can improve the performance of machine learning methods for predicting 30-day hospital readmissions.

Methods The study evaluates five feature extraction methods, namely principal component analysis (PCA), kernel principal component analysis (KPCA), isomap, Laplacian eigenmaps, and locality preserving projections (LPPs), for improving the accuracy of nine machine learning methods in predicting 30-day hospital readmissions. The prediction methods considered are logistic regression, Cox regression, linear discriminant analysis, k-nearest neighbor (KNN), support vector machines (SVMs), bagged trees, boosted trees, random forest, and artificial neural networks. All models are developed in MATLAB and validated using the area under the receiver operating characteristic curve (AUC) on two population-based data sets from partner hospitals.
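
As an illustration of this pipeline, the following is a minimal MATLAB sketch (not the authors' code) that pairs one feature extraction method (PCA) with one prediction method (KNN) and validates the result with a hold-out AUC. The predictor matrix X, the binary readmission label vector y, the 95% explained-variance cutoff, and the choice of 10 neighbors are illustrative assumptions.

% Minimal illustrative sketch: PCA feature extraction followed by a KNN
% classifier, evaluated with a hold-out AUC. Assumes X is an n-by-p predictor
% matrix and y a 0/1 vector of 30-day readmission labels.
rng(1);                                       % reproducibility
cv  = cvpartition(y, 'HoldOut', 0.3);         % simple hold-out split
Xtr = X(training(cv), :);  ytr = y(training(cv));
Xte = X(test(cv), :);      yte = y(test(cv));

[coeff, ~, ~, ~, explained] = pca(Xtr);       % fit PCA on training data only
k   = find(cumsum(explained) >= 95, 1);       % components explaining 95% of variance
Ztr = (Xtr - mean(Xtr)) * coeff(:, 1:k);      % project training data
Zte = (Xte - mean(Xtr)) * coeff(:, 1:k);      % project test data with training mean

mdl = fitcknn(Ztr, ytr, 'NumNeighbors', 10);  % KNN on the extracted features
[~, score] = predict(mdl, Zte);
[~, ~, ~, auc] = perfcurve(yte, score(:, 2), 1);   % AUC for the readmitted class
fprintf('Hold-out AUC with %d principal components: %.3f\n', k, auc);

The other feature extraction and prediction methods in the study can be substituted into the same template.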

Results Laplacian eigenmaps and isomap feature extraction provide the largest improvements in readmission predictive accuracy for the KNN, SVM, bagged trees, boosted trees, and linear discriminant analysis methods. For artificial neural networks, random forest, Cox regression, and logistic regression, improvement is observed for only one of the data sets. PCA and LPP provide the best computational efficiency, followed by KPCA, Laplacian eigenmaps, and isomap.

Conclusion Feature extraction methods can improve the predictive performance of machine learning methods for predicting readmissions. However, the improvement depends on the specific choice of prediction method and feature extraction method and on the complexity of the data set features.

 