Statistical learning and predictive modeling in data mining

2006, Doctor of Philosophy, Ohio State University, Statistics.
This research focuses on the Bayesian robustness properties of regularized optimization methods and on developing a hybrid predictive modeling strategy that emphasizes model interpretation. Many regularized optimization methods are known to have a Bayesian interpretation. In the first part of the thesis, we consider a class of flat-tailed priors for a general likelihood function, in the same spirit as the t-distribution suggested as a flat-tailed prior for the normal likelihood. We formalize the robustness property in terms of the relative tail behaviors of the likelihood and the prior. Using this setup, we examine the robustness properties of the bridge regression family and the group LASSO, as well as the consistency of the LASSO solution. In the second part, we propose a two-phase boosting method, called "additive regression tree and smoothing splines" (ARTSS), which is highly competitive in predictive performance. Unlike many automated learning procedures, which lack interpretability and operate as a "black box", ARTSS allows us to (1) estimate the marginal effects smoothly; (2) test the significance of non-additive effects; (3) provide a measure of relative variable importance for main effects and interactions; and (4) select variables and/or incorporate hierarchical structure in modeling. Finally, we apply ARTSS to two large public domain data sets and discuss the understanding gained from the fitted models.
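
As a brief illustration of the Bayesian interpretation mentioned above, the following sketch writes the bridge regression estimate as a posterior mode. The linear model, the Gaussian noise assumption, and the symbols $y$, $X$, $\beta$, $\sigma^2$, $\tau$, $\lambda$ and $q$ are assumptions introduced here for illustration; they are not taken from the abstract, and the thesis itself treats a more general likelihood.

```latex
% Assumed setup (illustrative): y = X\beta + \varepsilon, \varepsilon \sim N(0, \sigma^2 I),
% with independent priors on the coefficients whose tails are controlled by q > 0.
\begin{align*}
  \pi(\beta) &\propto \prod_{j=1}^{p} \exp\!\left(-\tau\,|\beta_j|^{q}\right), \\
  -\log \pi(\beta \mid y) &= \frac{1}{2\sigma^{2}}\,\lVert y - X\beta \rVert_2^{2}
      + \tau \sum_{j=1}^{p} |\beta_j|^{q} + \text{const}, \\
  \hat{\beta}_{\text{MAP}} &= \arg\min_{\beta}\;
      \lVert y - X\beta \rVert_2^{2} + \lambda \sum_{j=1}^{p} |\beta_j|^{q},
      \qquad \lambda = 2\sigma^{2}\tau .
\end{align*}
% q = 1 gives the LASSO (Laplace prior); q = 2 gives ridge regression (Gaussian prior).
```

Under these assumptions, a smaller $q$ corresponds to a heavier-tailed prior, which is the sense in which the relative tail behavior of the prior and the likelihood becomes relevant.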
Prem Goel (Advisor)

Recommended Citations

  • Li, B. (2006). Statistical learning and predictive modeling in data mining [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111

    APA Style (7th edition)

  • Li, Bin. Statistical learning and predictive modeling in data mining. 2006. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111.

    MLA Style (8th edition)

  • Li, Bin. "Statistical learning and predictive modeling in data mining." Doctoral dissertation, Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111

    Chicago Manual of Style (17th edition)