Multiple-Instance Learning from Distributions

Doran, Gary Brian, Jr.

2015, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
I propose a new theoretical framework for analyzing the multiple-instance learning (MIL) setting. In MIL, training examples are provided to a learning algorithm in the form of labeled sets, or "bags," of instances. Applications of MIL include 3-D quantitative structure-activity relationship prediction for drug discovery and content-based image retrieval for web search. The goal of an algorithm is to learn a function that correctly labels new bags or a function that correctly labels new instances. I propose that bags should be treated as latent distributions from which samples are observed. I show that, under weak assumptions, it is possible to learn accurate instance- and bag-labeling functions in this setting, as well as functions that correctly rank bags or instances. Additionally, my theoretical results suggest that it is possible to learn to rank efficiently using traditional, well-studied "supervised" learning approaches. These results also indicate that supervised approaches for learning from distributions can be used to directly learn bag-labeling functions efficiently. I perform an extensive empirical evaluation that supports the theoretical predictions entailed by the new framework. In addition to showing how supervised approaches can be applied to MIL, I prove new hardness results on using MI-specific algorithms to learn hyperplane labeling functions for instances. Finally, I propose a new resampling approach for MIL, analyze it under the new theoretical framework, and show that it can improve the performance of MI classifiers when training set sizes are small. In summary, the proposed theoretical framework leads to a better understanding of the relationship between the MI and standard supervised learning settings, and it provides new methods for learning from MI data that are more accurate, more efficient, and have better-understood theoretical properties than existing MI-specific algorithms.
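To make the "bags as samples from latent distributions" view concrete, the following is a minimal illustrative sketch, not the dissertation's method. It assumes a hypothetical generative process (`sample_bag`), summarizes each bag by its empirical mean (one simple choice of distribution embedding), and fits an ordinary supervised nearest-centroid classifier to those bag summaries. All function names, the Gaussian data, and the choice of classifier are assumptions made for illustration only.

```python
import random


def sample_bag(label, size, rng):
    """Draw one bag of 2-D instances from a hypothetical latent distribution.

    Assumption for illustration: negative bags sample only from a background
    component, while positive bags mix in instances from a shifted component.
    """
    bag = []
    for _ in range(size):
        if label == 1 and rng.random() < 0.5:
            bag.append((rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)))
        else:
            bag.append((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
    return bag


def mean_embedding(bag):
    """Summarize a bag by the empirical mean of its instances."""
    n = len(bag)
    dim = len(bag[0])
    return tuple(sum(x[d] for x in bag) / n for d in range(dim))


def fit_centroids(embeddings, labels):
    """Standard supervised step: one centroid per class over bag embeddings."""
    centroids = {}
    for y in set(labels):
        pts = [e for e, l in zip(embeddings, labels) if l == y]
        dim = len(pts[0])
        centroids[y] = tuple(sum(p[d] for p in pts) / len(pts) for d in range(dim))
    return centroids


def predict(centroids, embedding):
    """Label a bag embedding with the class of the nearest centroid."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda y: sqdist(centroids[y], embedding))


rng = random.Random(0)

# Training bags with alternating labels (0 = negative, 1 = positive).
train_labels = [i % 2 for i in range(40)]
train_bags = [sample_bag(y, 30, rng) for y in train_labels]
centroids = fit_centroids([mean_embedding(b) for b in train_bags], train_labels)

# Held-out bags drawn from the same hypothetical process.
test_labels = [i % 2 for i in range(20)]
test_bags = [sample_bag(y, 30, rng) for y in test_labels]
predictions = [predict(centroids, mean_embedding(b)) for b in test_bags]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"bag-level accuracy: {accuracy:.2f}")
```

The point of the sketch is only that, once each bag is reduced to a fixed-length summary of its underlying distribution, any standard supervised learner can be trained on bag labels directly. Richer embeddings or other classifiers could be substituted without changing the overall structure.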
Soumya Ray (Advisor)
Harold Connamacher (Committee Member)
Michael Lewicki (Committee Member)
Stanislaw Szarek (Committee Member)
Kiri Wagstaff (Committee Member)
248 p.

Recommended Citations

  • Doran, Jr., G. B. (2015). Multiple-Instance Learning from Distributions [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923

    APA Style (7th edition)

  • Doran, Jr., Gary. Multiple-Instance Learning from Distributions. 2015. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923.

    MLA Style (8th edition)

  • Doran, Jr., Gary. "Multiple-Instance Learning from Distributions." Doctoral dissertation, Case Western Reserve University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=case1417736923

    Chicago Manual of Style (17th edition)