Contrast Pattern Aided Regression and Classification

Taslimitehrani, Vahid

2016, Doctor of Philosophy (PhD), Wright State University, Computer Science and Engineering PhD.
Regression and classification techniques play an essential role in many data mining tasks and have broad applications. However, most state-of-the-art regression and classification techniques are often unable to adequately model the interactions among predictor variables in highly heterogeneous datasets. New techniques that can effectively model such complex and heterogeneous structures are needed to significantly improve prediction accuracy. In this dissertation, we propose a novel type of accurate and interpretable regression and classification models, named Pattern Aided Regression (PXR) and Pattern Aided Classification (PXC), respectively. Both PXR and PXC rely on identifying regions in the data space where a given baseline model has large modeling errors, characterizing such regions using patterns, and learning specialized models for those regions. Each PXR/PXC model contains several pairs of contrast patterns and local models, where a local model is applied only to data instances matching its associated pattern. We also propose a class of classification and regression techniques called Contrast Pattern Aided Regression (CPXR) and Contrast Pattern Aided Classification (CPXC) to build accurate and interpretable PXR and PXC models. We have conducted a set of comprehensive studies to evaluate the performance of CPXR and CPXC. The results show that CPXR and CPXC outperform state-of-the-art regression and classification algorithms, often by significant margins. The results also show that CPXR and CPXC are especially effective for heterogeneous and high-dimensional datasets. Besides being new types of models, PXR and PXC models can also provide insights into data heterogeneity and diverse predictor-response relationships. We have also adapted CPXC to classify imbalanced datasets and introduced a new algorithm called Contrast Pattern Aided Classification for Imbalanced Datasets (CPXCim).
In CPXCim, we applied a weighting method to boost minority instances as well as a new filtering method to prune patterns with imbalanced matching datasets. Finally, we applied our techniques to three real-world applications, two in the healthcare domain and one in the soil mechanics domain. PXR and PXC models are significantly more accurate than other learning algorithms in those three applications.
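
The pattern-aided idea described in the abstract (fit a baseline, find regions where it errs badly, describe those regions with patterns, and fit a local model per pattern) can be illustrated with a minimal Python sketch. This is not the CPXR algorithm from the dissertation. It rests on several simplifying assumptions that are not in the source: a pattern is a single-feature quantile interval, "large error" means a squared error above the median, a pattern is kept when it is more frequent among large-error instances and its local model lowers the squared error, and each instance is predicted by the first matching pattern in order of error reduction. All function names and parameters (`fit_pattern_aided`, `n_bins`, `min_support`, and so on) are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def fit_pattern_aided(X, y, n_bins=4, min_support=10):
    """Return a baseline model and a list of (gain, feature, lo, hi, local_model)."""
    baseline = fit_linear(X, y)
    sq_err = (y - predict_linear(baseline, X)) ** 2
    large = sq_err > np.median(sq_err)  # assumption: median split into large/small error
    pairs = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1))
        for lo, hi in zip(edges[:-1], edges[1:]):
            match = (X[:, j] >= lo) & (X[:, j] <= hi)
            if match.sum() < min_support:
                continue
            # Contrast test (simplified): the pattern must be more frequent
            # among large-error instances than among small-error ones.
            if match[large].mean() <= match[~large].mean():
                continue
            local = fit_linear(X[match], y[match])
            local_err = ((y[match] - predict_linear(local, X[match])) ** 2).sum()
            gain = sq_err[match].sum() - local_err
            if gain > 0:
                pairs.append((gain, j, lo, hi, local))
    pairs.sort(key=lambda p: -p[0])  # largest error reduction first
    return baseline, pairs

def predict_pattern_aided(baseline, pairs, X):
    """Use the first matching pattern's local model, else the baseline."""
    pred = predict_linear(baseline, X)
    assigned = np.zeros(len(X), dtype=bool)
    for _, j, lo, hi, local in pairs:
        match = (X[:, j] >= lo) & (X[:, j] <= hi) & ~assigned
        if match.any():
            pred[match] = predict_linear(local, X[match])
        assigned |= match
    return pred

# Synthetic heterogeneous data: flat for x <= 0.75, steep afterwards.
X = np.linspace(0, 1, 200).reshape(-1, 1)
y = 40 * np.maximum(0, X[:, 0] - 0.75)

baseline, pairs = fit_pattern_aided(X, y)
base_mse = np.mean((y - predict_linear(baseline, X)) ** 2)
pa_mse = np.mean((y - predict_pattern_aided(baseline, pairs, X)) ** 2)
print(f"baseline MSE: {base_mse:.3f}, pattern-aided MSE: {pa_mse:.3f}")
```

On this synthetic "hockey stick" dataset a single global line fits poorly, while the intervals covering the flat and steep segments each get an accurate local line. The resulting model remains a short list of (pattern, local model) pairs, which is the source of the interpretability the abstract refers to.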
Guozhu Dong, Ph.D. (Advisor)
Amit Sheth, Ph.D. (Committee Member)
Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Keke Chen, Ph.D. (Committee Member)
Jyotishman Pathak, Ph.D. (Committee Member)
120 p.

Recommended Citations

  • Taslimitehrani, V. (2016). Contrast Pattern Aided Regression and Classification [Doctoral dissertation, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1459377694

    APA Style (7th edition)

  • Taslimitehrani, Vahid. Contrast Pattern Aided Regression and Classification. 2016. Wright State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1459377694.

    MLA Style (8th edition)

  • Taslimitehrani, Vahid. "Contrast Pattern Aided Regression and Classification." Doctoral dissertation, Wright State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1459377694

    Chicago Manual of Style (17th edition)