Improved Feature-Selection for Classification Problems using Multiple Auto-Encoders

2018, PhD, University of Cincinnati, Engineering and Applied Science: Computer Science and Engineering.
Complex, high-dimensional data such as medical images, sensor measurements, and sounds are often available only in limited quantities. Training classification algorithms directly on such datasets can lead to unsatisfactory accuracy. Feature learning refers to a set of dimensionality-reduction approaches that transform high-dimensional data into a low-dimensional representational space, making classification more tractable. The research in this dissertation presents a novel feature selection method based on harvesting diverse features from a feature pool generated by a population of independently trained sparse auto-encoders (SAEs). The study is based on three hypotheses: a) a set of features selected from the pool can represent the original data better than the features of a single SAE; b) diversity between features is an appropriate criterion for selecting features from the pool; and c) using SAEs with very narrow hidden layers, called pinched SAEs (pSAEs), to generate the feature pool increases the efficiency of the process at no cost in performance. The method is validated on two classification tasks: handwritten digit recognition on MNIST data and object recognition on CIFAR-10 data. Several diversity metrics and feature selection heuristics are compared to determine which to use in the final algorithm. To demonstrate that the method generalizes to real-world problems, a slightly modified version was applied to an Autism Spectrum Disorder (ASD) diagnosis task using a real, complex neuroimaging dataset. The feature selection method was integrated into a deep neural network (DNN)-based classification model for distinguishing ASD patients from typically developing (TD) controls based on resting-state functional magnetic resonance imaging (rs-fMRI) images.
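The idea of harvesting diverse features from a pool can be illustrated with a minimal sketch. The abstract does not state which diversity metric or selection heuristic the final algorithm uses, so everything here is an assumption made for illustration: features are represented as plain weight vectors, diversity is measured by absolute cosine similarity, selection is a greedy max-min heuristic seeded with the first feature, and the names `cosine_similarity` and `select_diverse_features` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_diverse_features(pool, k):
    """Greedily pick up to k feature indices from the pool.

    Assumed heuristic (not taken from the dissertation): each new pick is the
    candidate whose highest |cosine| to any already-selected feature is lowest.
    """
    if k <= 0 or not pool:
        return []
    selected = [0]  # assumed seeding choice: start from the first pooled feature
    # max_sim[i] tracks the highest |cosine| between feature i and the selected set
    max_sim = [abs(cosine_similarity(pool[0], f)) for f in pool]
    while len(selected) < min(k, len(pool)):
        candidates = [i for i in range(len(pool)) if i not in selected]
        best = min(candidates, key=lambda i: max_sim[i])
        selected.append(best)
        for i, f in enumerate(pool):
            max_sim[i] = max(max_sim[i], abs(cosine_similarity(pool[best], f)))
    return selected


# Toy pool standing in for hidden-unit weight vectors gathered from several SAEs.
pool = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],  # nearly redundant with the first feature
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
chosen = select_diverse_features(pool, 3)
print(chosen)
```

In this toy pool the near-duplicate second vector is passed over in favor of the two orthogonal ones, which is the behavior a diversity criterion is meant to produce.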
Results demonstrate that the feature selection method improves classification performance by as much as 9.09% over the same model without feature selection, and that the resulting performance compares favorably with other studies. In addition, a Fisher score-based biomarker identification method built on the DNN is developed and used to identify 32 functional connectivity (FC) features related to ASD. The biological meaning of these findings is discussed in detail.
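For readers unfamiliar with the Fisher score, the following sketch computes the standard form of it for each input feature: between-class scatter divided by within-class scatter. It is not the dissertation's DNN-based variant, whose details the abstract does not give. The function name `fisher_scores` and the toy data are hypothetical.

```python
def fisher_scores(X, y):
    """Standard Fisher score per feature column of X, given class labels y.

    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        if within > 0.0:
            scores.append(between / within)
        elif between > 0.0:
            scores.append(float("inf"))  # perfectly separated, zero spread
        else:
            scores.append(0.0)  # constant feature carries no information
    return scores


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 4.0], [8.0, 5.0], [9.0, 4.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranked)
```

Ranking features by such a score and keeping the top entries is one common way to shortlist candidate biomarkers. How the dissertation arrives at its 32 FC features is described in the full text, not here.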
Ali Minai, Ph.D. (Committee Chair)
Raj Bhatnagar, Ph.D. (Committee Member)
Yizong Cheng, Ph.D. (Committee Member)
Long Lu, Ph.D. (Committee Member)
Carla Purdy, Ph.D. (Committee Member)
107 p.

Recommended Citations

  • Guo, X. (2018). Improved Feature-Selection for Classification Problems using Multiple Auto-Encoders [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1522420335154157

    APA Style (7th edition)

  • Guo, Xinyu. Improved Feature-Selection for Classification Problems using Multiple Auto-Encoders. 2018. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1522420335154157.

    MLA Style (8th edition)

  • Guo, Xinyu. "Improved Feature-Selection for Classification Problems using Multiple Auto-Encoders." Doctoral dissertation, University of Cincinnati, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1522420335154157

    Chicago Manual of Style (17th edition)