Files
gdoran_dissertation.pdf (2.86 MB)
Multiple-Instance Learning from Distributions
Author Info
Doran, Gary Brian, Jr.
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923
Year and Degree
2015, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
Abstract
I propose a new theoretical framework for analyzing the multiple-instance learning (MIL) setting. In MIL, training examples are provided to a learning algorithm in the form of labeled sets, or "bags," of instances. Applications of MIL include 3-D quantitative structure-activity relationship prediction for drug discovery and content-based image retrieval for web search. The goal of an algorithm is to learn a function that correctly labels new bags or a function that correctly labels new instances. I propose that bags should be treated as latent distributions from which samples are observed. I show that it is possible to learn accurate instance- and bag-labeling functions in this setting, as well as functions that correctly rank bags or instances, under weak assumptions. Additionally, my theoretical results suggest that it is possible to learn to rank efficiently using traditional, well-studied "supervised" learning approaches. These results also indicate that supervised approaches for learning from distributions can be used to directly learn bag-labeling functions efficiently. I perform an extensive empirical evaluation that supports the theoretical predictions entailed by the new framework. In addition to showing how supervised approaches can be applied to MIL, I prove new hardness results on using MI-specific algorithms to learn hyperplane labeling functions for instances. Finally, I propose a new resampling approach for MIL, analyze it under the new theoretical framework, and show that it can improve the performance of MI classifiers when training set sizes are small. In summary, the proposed theoretical framework leads to a better understanding of the relationship between the MI and standard supervised learning settings, and it provides new methods for learning from MI data that are more accurate, more efficient, and have better understood theoretical properties than existing MI-specific algorithms.
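As an illustration of the idea that a bag can be treated as a sample from a latent distribution, the following minimal Python sketch compares two bags by averaging an RBF kernel over all pairs of their instances (often called a mean-map or set kernel). This is an assumption made for illustration only: the abstract does not say which kernels or algorithms the dissertation uses, and the names `rbf`, `mean_map_kernel`, `bag_label_from_instances`, and the toy bags are hypothetical. The helper `bag_label_from_instances` encodes the commonly used MI assumption that a bag is positive when at least one of its instances is positive.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF kernel between two instances given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_map_kernel(bag_a, bag_b, gamma=1.0):
    """Average pairwise RBF kernel between two bags.

    Each bag is viewed as an empirical sample from a latent distribution;
    the average is the inner product of the two empirical mean embeddings.
    (Illustrative choice, not taken from the dissertation.)
    """
    total = sum(rbf(x, y, gamma) for x in bag_a for y in bag_b)
    return total / (len(bag_a) * len(bag_b))


def bag_label_from_instances(instance_labels):
    """Standard MI assumption: a bag is positive if any instance is positive."""
    return any(instance_labels)


# Hypothetical toy bags of 2-D instances.
bag_a = [(0.0, 0.0), (1.0, 0.0)]
bag_b = [(0.0, 0.0), (1.0, 0.0)]
bag_c = [(5.0, 5.0), (6.0, 5.0)]

similar = mean_map_kernel(bag_a, bag_b)
dissimilar = mean_map_kernel(bag_a, bag_c)
```

A kernel of this kind can be passed to any standard kernel-based supervised learner, with one training example per bag, which is one way to read the abstract's claim that supervised approaches for learning from distributions can learn bag-labeling functions directly.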
Committee
Soumya Ray (Advisor)
Harold Connamacher (Committee Member)
Michael Lewicki (Committee Member)
Stanislaw Szarek (Committee Member)
Kiri Wagstaff (Committee Member)
Pages
248 p.
Subject Headings
Artificial Intelligence; Computer Science
Keywords
machine learning; multiple-instance learning; kernel methods; learning theory; classification
Recommended Citations
Citations
Doran, Jr., G. B. (2015). Multiple-Instance Learning from Distributions [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923
APA Style (7th edition)
Doran, Jr., Gary. Multiple-Instance Learning from Distributions. 2015. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923.
MLA Style (8th edition)
Doran, Jr., Gary. "Multiple-Instance Learning from Distributions." Doctoral dissertation, Case Western Reserve University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923
Chicago Manual of Style (17th edition)
Document number: case1417736923
Download Count: 2,512
© 2014, all rights reserved.
This open access ETD is published by Case Western Reserve University School of Graduate Studies and OhioLINK.