Files
osu1155058111.pdf (906.7 KB)
Statistical learning and predictive modeling in data mining
Author Info
Li, Bin
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111
Year and Degree
2006, Doctor of Philosophy, Ohio State University, Statistics.
Abstract
This research focuses on the Bayesian robustness properties of regularized optimization methods and on developing a hybrid predictive modeling strategy that emphasizes model interpretation. It is known that many regularized optimization methods have a Bayesian interpretation. In the first part of the thesis, we consider a class of flat-tailed priors for a general likelihood function, in the same spirit as the t-distribution suggested as a flat-tailed prior for the normal likelihood. We formalize the robustness property in terms of the relative tail behavior of the likelihood and the prior. Using this setup, we examine the robustness properties of the bridge regression family and the group LASSO, as well as the consistency of the LASSO solution. In the second part, we propose a two-phase boosting method, called "additive regression tree and smoothing splines" (ARTSS), which is highly competitive in predictive performance. Unlike many automated learning procedures, which lack interpretability and operate as a "black box", ARTSS allows us to (1) estimate marginal effects smoothly; (2) test the significance of non-additive effects; (3) provide a measure of relative variable importance for main effects and interactions; and (4) select variables and/or incorporate hierarchical structure in modeling. Finally, we apply ARTSS to two large public-domain data sets and discuss the insights gained from the fitted models.
Committee
Prem Goel (Advisor)
Subject Headings
Statistics
Keywords
Bayesian robustness; Boosting; Flat-tailed prior distribution; Interpretation; MART; Statistical learning
Recommended Citations
Li, B. (2006). Statistical learning and predictive modeling in data mining [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111
APA Style (7th edition)
Li, Bin. Statistical learning and predictive modeling in data mining. 2006. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111.
MLA Style (8th edition)
Li, Bin. "Statistical learning and predictive modeling in data mining." Doctoral dissertation, Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=osu1155058111
Chicago Manual of Style (17th edition)
Document number:
osu1155058111
Download Count:
904
Copyright Info
© 2006, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.