Analysis of Machine Learning Algorithms for Diagnosis of Diffuse Lung Diseases
Funding This work was partially funded by Fapeal, CNPq, and SEFAZ-AL. The work of Héctor Allende-Cid was supported by the project FONDECYT Initiation into Research 11150248.
05 December 2017
26 May 2018
15 March 2019 (online)
Background Diffuse lung diseases (DLDs) are a diverse group of pulmonary disorders characterized by inflammation of lung tissue, which may lead to permanent loss of the ability to breathe and to death. Distinguishing among these diseases is challenging for physicians due to their wide variety and unknown causes. Computer-aided diagnosis (CAD) is a useful approach to improving diagnostic accuracy by combining information provided by experts with Machine Learning (ML) methods.
Objectives To explore the potential of dimensionality reduction combined with ML methods for the diagnosis of DLDs, and to improve classification accuracy over state-of-the-art methods.
Methods We used a data set of 3252 regions of interest (ROIs), with 28 features extracted per ROI. To reduce feature dimensionality, we applied Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Stepwise Selection (Forward, Backward, and Forward-Backward). The resulting feature subsets were used as input to the following ML methods: Support Vector Machine (SVM), Gaussian Mixture Model (GMM), k-Nearest Neighbor (k-NN), and Deep Feedforward Neural Network (DFNN). We also applied a Deep Convolutional Neural Network (DCNN) directly to the ROIs.
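The Methods summary pairs a feature-reduction step with a classifier. A minimal sketch of one such pairing, forward stepwise selection wrapped around k-Nearest Neighbor, is given below. It is illustrative only and not the authors' implementation: the data are synthetic stand-ins for ROI feature vectors, the helper names (`make_data`, `knn_predict`, `holdout_accuracy`, `forward_selection`) are hypothetical, and the selection criterion (holdout accuracy with a minimum gain) is an assumption, since the abstract does not state which criterion was used.

```python
import random
from collections import Counter


def make_data(n_per_class=40, n_features=6, informative=(0, 3), seed=0):
    """Synthetic two-class data (an assumption, not the ROI data set).

    Only the feature indices in `informative` differ in mean between classes.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in informative:
                row[j] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, cols, k=3):
    """Majority vote among the k nearest training rows, using only `cols`."""
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in cols), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def holdout_accuracy(X, y, cols, k=3):
    """Train on even-indexed rows and score on odd-indexed rows."""
    train_X, train_y = X[0::2], y[0::2]
    test_X, test_y = X[1::2], y[1::2]
    hits = sum(
        knn_predict(train_X, train_y, x, cols, k) == target
        for x, target in zip(test_X, test_y)
    )
    return hits / len(test_y)


def forward_selection(X, y, max_features=3, min_gain=0.01):
    """Greedily add the feature that most improves holdout accuracy.

    Stops when no candidate improves accuracy by at least `min_gain`
    or when `max_features` features have been selected.
    """
    selected, best = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scores = {
            j: holdout_accuracy(X, y, selected + [j]) for j in sorted(remaining)
        }
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < min_gain:
            break
        selected.append(j_best)
        best = scores[j_best]
        remaining.remove(j_best)
    return selected, best


X, y = make_data()
selected, accuracy = forward_selection(X, y)
print("selected features:", selected, "holdout accuracy:", round(accuracy, 3))
```

Backward selection would start from all features and greedily remove one at a time; the forward-backward variant alternates the two moves. A single holdout split is used here for brevity, and cross-validation would give a less noisy criterion.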
Results The largest reduction, from 28 to 5 dimensions, was achieved using LDA. The best classification results were obtained by the DFNN, with an overall accuracy of 99.60%.
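The reduction to 5 dimensions is consistent with a general property of LDA, stated here as background rather than as a claim about the authors' implementation. For $C$ classes and $p$ input features, the between-class scatter matrix has rank at most $C - 1$, so the number of discriminant directions $d$ satisfies:

```latex
d \le \min(p,\; C - 1)
```

With $p = 28$, a projection onto $d = 5$ axes would correspond to $C = 6$ classes if every discriminant axis were retained. The number of classes is not given in this abstract, so that figure is an assumption.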
Conclusions This work contributes to the analysis and selection of features that can efficiently characterize the DLDs studied.