Ontology-based Feature Construction on Non-structured Data

2015, PhD, University of Cincinnati, Engineering and Applied Science: Mechanical Engineering.
Data mining on non-structured data remains relatively under-researched because most efforts in the KDD community over recent decades have been devoted to mining relational, structured data. With the information explosion of the big-data era, the majority of knowledge now emerges in various forms of non-structured data. This necessitates new methodologies for constructing meaningful features from non-structured data to facilitate knowledge learning. Most existing data-driven methods serve only the objective of improving the discriminative power of features, while severely underestimating the importance of interpretability. In many domains, the prime aim is to discover and learn new hypotheses and knowledge from non-structured data in a meaningful and understandable form. In this study, an ontology-based feature construction framework is proposed. The framework represents the structural relations, embedded with domain knowledge, in the form of an ontology. Features at three levels are defined based on the granularity of the ontology. A feature, representing a domain hypothesis, can be readily constructed by evolving the ontology. Support and confidence are proposed as two criteria for evaluating the usefulness of features when searching for optimal ones. Furthermore, domain experts are involved interactively to explore new hypotheses with the aid of data-driven heuristic algorithms. The ontology is also flexible enough to be reconstructed to accommodate different hypotheses. A comprehensive case study is conducted in which the proposed methodology is applied to miscellaneous medical claim data to build features that are both interpretable and highly predictive for hospitalization forecasting. A medical professor is consulted throughout to bring in domain insights that aid ontology evolution and to assess the meaningfulness of the constructed features and predictions. The constructed features outperform those based on initial hypotheses in terms of prediction accuracy. Moreover, the ability to discover new and useful knowledge is demonstrated by the meaningfulness of the new features and the evolved ontology.
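
The abstract names support and confidence as feature-evaluation criteria without defining them. As a rough illustration only, the sketch below uses the standard association-rule definitions, which may differ from those in the dissertation: support is the fraction of all records in which both the feature and the outcome hold, and confidence is the fraction of feature-positive records that also show the outcome. The function name `support_confidence` and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def support_confidence(
    feature: Sequence[bool], outcome: Sequence[bool]
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule "feature -> outcome".

    Assumption: standard association-rule definitions, not necessarily
    the ones used in the dissertation.
    """
    if len(feature) != len(outcome):
        raise ValueError("feature and outcome must have the same length")
    n = len(feature)
    if n == 0:
        raise ValueError("at least one record is required")

    # Records where the feature is present and the outcome occurred.
    both = sum(1 for f, o in zip(feature, outcome) if f and o)
    # Records where the feature is present, regardless of outcome.
    covered = sum(1 for f in feature if f)

    support = both / n
    # Confidence is undefined when no record has the feature; report 0.0.
    confidence = both / covered if covered else 0.0
    return support, confidence


# Toy data (hypothetical): one entry per patient record.
has_feature = [True, True, False, True]
hospitalized = [True, False, False, True]
support, confidence = support_confidence(has_feature, hospitalized)
```

Under these assumed definitions, a candidate feature with high confidence but very low support would apply to too few records to be useful, which is one reason to consider both criteria together.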
Hongdao Huang, Ph.D. (Committee Chair)
Lawson Wulsin, M.D. (Committee Member)
Nuo Xu, Ph.D. (Committee Member)
Sundararaman Anand, Ph.D. (Committee Member)
David Thompson, Ph.D. (Committee Member)
95 p.

Recommended Citations

  • Ni, W. (2015). Ontology-based Feature Construction on Non-structured Data [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340

    APA Style (7th edition)

  • Ni, Weizeng. Ontology-based Feature Construction on Non-structured Data. 2015. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340.

    MLA Style (8th edition)

  • Ni, Weizeng. "Ontology-based Feature Construction on Non-structured Data." Doctoral dissertation, University of Cincinnati, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340

    Chicago Manual of Style (17th edition)