Files
latham_thesis.pdf (2.01 MB)
Multiple-Instance Feature Ranking
Author Info
Latham, Andrew C
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=case1440642294
Abstract Details
Year and Degree
2016, Master of Sciences, Case Western Reserve University, EECS - Electrical Engineering.
Abstract
Multiple-instance learning is a subfield of machine learning in which training data is provided as labeled sets of instances called "bags," with the instance labels themselves unknown. Multiple-instance learning has many important practical applications, including the drug discovery, image retrieval, and text classification problems. In this thesis I will investigate the problem of feature ranking, where the most important features are determined based on some evaluation metric, in the context of multiple-instance learning. In order to rank features well, the instance labels are required; however, in multiple-instance learning, only the labels of the bags are available. I will investigate the implications of giving every instance the label of its bag, and using the resulting supervised dataset to evaluate and rank features. Even though this introduces additional one-sided noise to the data set, I provide a theoretical analysis that shows that in many situations a relevant set of features can be recovered, where "relevant" is defined using the feature set that would be found if the true instance labels were available, removing the limitations of the multiple-instance learning problem entirely. I describe the factors that control when a good feature ranking can be learned when both accuracy and AUC are used as scoring functions. Finally, I evaluate the performance of several dimensionality reduction algorithms on a number of multiple-instance learning datasets and find that, in addition to being fast and relatively simple, the algorithms proposed in this thesis are competitive with existing techniques and can often improve the performance of a classifier.
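The labeling strategy described in the abstract (give every instance the label of its bag, then rank features on the resulting supervised dataset) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy data, and the choice to score each feature by how far its single-feature AUC lies from 0.5 are all assumptions made for illustration. The abstract states only that accuracy and AUC are used as scoring functions.

```python
from typing import List, Sequence, Tuple

# A bag is (instances, bag_label); each instance is a feature vector.
# Hypothetical type alias, introduced only for this sketch.
Bag = Tuple[List[Sequence[float]], int]


def propagate_bag_labels(bags: List[Bag]) -> Tuple[List[List[float]], List[int]]:
    """Flatten bags into instances, giving every instance its bag's label."""
    X, y = [], []
    for instances, bag_label in bags:
        for inst in instances:
            X.append(list(inst))
            y.append(bag_label)
    return X, y


def feature_auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUC of one feature used directly as a score (ties count as 0.5)."""
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_features(bags: List[Bag]) -> List[int]:
    """Return feature indices, most informative first.

    Assumption: a feature's score is |AUC - 0.5| on the bag-labeled instances,
    so a feature that separates the classes in either direction ranks high.
    """
    X, y = propagate_bag_labels(bags)
    scored = []
    for j in range(len(X[0])):
        auc = feature_auc([row[j] for row in X], y)
        scored.append((abs(auc - 0.5), j))
    scored.sort(key=lambda t: (-t[0], t[1]))  # high score first, index breaks ties
    return [j for _, j in scored]


# Toy data: feature 0 is informative, feature 1 is noise. Each positive bag
# holds one truly positive instance (high feature 0) and one negative instance
# that receives a positive label anyway: the one-sided noise.
toy_bags: List[Bag] = [
    ([[0.9, 0.2], [0.1, 0.7]], 1),
    ([[0.8, 0.5], [0.2, 0.1]], 1),
    ([[0.1, 0.3], [0.2, 0.6]], 0),
    ([[0.0, 0.4], [0.3, 0.8]], 0),
]

print(rank_features(toy_bags))  # [0, 1]
```

On this toy data, feature 0 still ranks first even though half of the instances labeled positive are actually negative, which matches the abstract's claim that a relevant feature set can often be recovered despite the label noise.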
Committee
Soumya Ray (Advisor)
Harold Connamacher (Committee Member)
Michael Lewicki (Committee Member)
Pages
123 p.
Subject Headings
Computer Science
Keywords
Machine Learning; Feature Selection; Feature Ranking; Multiple-Instance Learning
Citations
APA Style (7th edition): Latham, A. C. (2016). Multiple-Instance Feature Ranking [Master's thesis, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1440642294
MLA Style (8th edition): Latham, Andrew. Multiple-Instance Feature Ranking. 2016. Case Western Reserve University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1440642294.
Chicago Manual of Style (17th edition): Latham, Andrew. "Multiple-Instance Feature Ranking." Master's thesis, Case Western Reserve University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1440642294
Document number: case1440642294
Download Count: 971
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Case Western Reserve University School of Graduate Studies and OhioLINK.