Global-Local Hybrid Classification Ensembles: Robust Performance with a Reduced Complexity

2009, Master of Science in Engineering, University of Toledo, Computer Science.

The current trend in machine learning ensemble classifier research is to improve performance, at times marginally, beyond what existing methods can deliver. This tendency has consequently complicated ensemble designs to a level that is possibly not justified for many domains. This thesis proposes a new design for classification ensembles, Global-Local Hybrid Ensemble (GLHE), which offers robust performance with a less complex design than comparably performing ensembles. GLHE exploits two sources of diversity in its base-classifiers, heterogeneous (hybrid) and homogeneous. Heterogeneity is achieved with two learning algorithms – one global and one local – that are assumed to have an intrinsic difference in learning to ensure high levels of diversity. Homogeneity is implemented through the use of multiple parameterizations of the same learning algorithm to allow both global and local learners to explore their respective region of the hypothesis space while also creating additional, albeit small, diversity among the base-classifiers.
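As an illustration only, the design described above can be sketched in a few lines of plain Python: one global learner (a small decision tree) and one local learner (k-nearest-neighbor), each instantiated with several parameterizations and combined by an unweighted majority vote. The function names, the parameter grids (`depths`, `ks`), the voting rule, and the toy data are all assumptions made for this sketch; the abstract does not specify them, and this is not the thesis implementation.

```python
from collections import Counter

def majority(labels):
    """Most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(X, y, max_depth):
    """Global learner: a tiny axis-aligned decision tree fit on all the data."""
    if max_depth == 0 or len(set(y)) == 1:
        return majority(y)  # leaf node holds a class label
    best = None
    for f in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest,
        # so that both sides of the split are non-empty.
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            # Errors made if each side predicts its own majority label.
            err = (sum(lab != majority(left) for lab in left)
                   + sum(lab != majority(right) for lab in right))
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:  # no usable split (all rows identical)
        return majority(y)
    _, f, t = best
    lo = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [lab for _, lab in lo], max_depth - 1),
            build_tree([r for r, _ in hi], [lab for _, lab in hi], max_depth - 1))

def tree_predict(tree, x):
    """Walk from the root to a leaf (internal nodes are tuples)."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def knn_predict(X, y, x, k):
    """Local learner: majority label among the k nearest training points."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(row, x)), i)
                   for i, row in enumerate(X))
    return majority([y[i] for _, i in dists[:k]])

def glhe_fit(X, y, depths=(1, 2, 3), ks=(1, 3, 5)):
    """Build base classifiers: several trees plus several k-NN variants."""
    members = []
    for d in depths:  # homogeneous diversity among the global learners
        tree = build_tree(X, y, d)
        members.append(lambda x, tree=tree: tree_predict(tree, x))
    for k in ks:      # homogeneous diversity among the local learners
        members.append(lambda x, k=k: knn_predict(X, y, x, k))
    return members

def glhe_predict(members, x):
    """Combine the base classifiers with an unweighted majority vote."""
    return majority([m(x) for m in members])

# Toy data (hypothetical): two well-separated clusters in two dimensions.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
y = ["a", "a", "a", "b", "b", "b"]
members = glhe_fit(X, y)
print(glhe_predict(members, [0.05, 0.05]))
print(glhe_predict(members, [1.0, 0.95]))
```

In this sketch the heterogeneous diversity comes from mixing the two learner types, while the homogeneous diversity comes from varying tree depth and neighborhood size within each type.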

A comprehensive simulation study is conducted to profile the performance capabilities of the proposed design, considering three types of classification performance measures, three types of diversity measures, and training/testing execution time as features of analysis. GLHE is implemented with decision tree (global) and nearest-neighbor (local) learners, and its performance on 46 benchmark datasets is compared to that of more than 70 ensembles from the literature and in-house simulations. Specific hypotheses are tested and evaluated with nonparametric statistical significance calculations. First, it is shown that GLHE performs comparably to hybrid ensembles with more learning algorithms (more complexity) and better than data manipulation ensembles. Second, the importance of the co-presence of global-local learners and heterogeneous/homogeneous diversity in the GLHE design is validated, along with our assumption that the global and local learners produce high levels of diversity. Finally, we create another implementation of GLHE with neural networks, which shows that the design is generic and allows for trade-offs between performance robustness and execution speed. Another experiment compares the performance of GLHE against that achieved by contestants in a data mining competition. Although the contestants likely fine-tuned their algorithms to optimize performance, the standard GLHE implementation still scores no worse than half of them.
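The abstract does not name the three diversity measures used in the study. As a hedged illustration of what a pairwise diversity measure can look like, the sketch below computes a simple disagreement rate: the fraction of instances on which two base classifiers predict different labels, averaged over all classifier pairs. The choice of measure, the function names, and the toy prediction vectors are assumptions made for this example, not details taken from the thesis.

```python
from itertools import combinations

def disagreement(preds_a, preds_b):
    """Fraction of instances on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

def mean_pairwise_disagreement(all_preds):
    """Average the disagreement rate over every pair of base classifiers."""
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)

# Hypothetical predictions from three base classifiers on four test instances.
tree_preds = [1, 1, 0, 0]
knn_preds = [1, 0, 0, 1]
other_preds = [1, 1, 0, 1]

print(disagreement(tree_preds, knn_preds))
print(mean_pairwise_disagreement([tree_preds, knn_preds, other_preds]))
```

A value of 0 means the classifiers always agree, and larger values indicate more diverse predictions.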

The results of the simulation study indicate that GLHE is indeed robust, even in comparison to more complex ensembles. The major contributions of this work are the findings that 1) global and local learners can effectively create high levels of diversity, 2) the GLHE design may offer a compromise between the robustness of traditional hybrid ensembles and the simplicity of data manipulation ensembles – an area not satisfied by other ensemble designs, and 3) the GLHE design is a suitable technique to apply to new problems when robust performance is needed but users do not have the resources for complex designs or in-depth empirical analysis.

Gursel Serpen (Advisor)
Henry Ledgard (Committee Member)
Han Yu (Committee Member)
228 p.

Recommended Citations

  • Baumgartner, D. (2009). Global-Local Hybrid Classification Ensembles: Robust Performance with a Reduced Complexity [Master's thesis, University of Toledo]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1241034194

    APA Style (7th edition)

  • Baumgartner, Dustin. Global-Local Hybrid Classification Ensembles: Robust Performance with a Reduced Complexity. 2009. University of Toledo, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=toledo1241034194.

    MLA Style (8th edition)

  • Baumgartner, Dustin. "Global-Local Hybrid Classification Ensembles: Robust Performance with a Reduced Complexity." Master's thesis, University of Toledo, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1241034194

    Chicago Manual of Style (17th edition)