Active Learning for Ranking from Noisy Observations

Ren, Wenbo

Keyword Search

School Logo

Dissertation_Wenbo.pdf (1.15 MB)

Active Learning for Ranking from Noisy Observations

Author Info

Ren, Wenbo

ORCID® Identifier

http://orcid.org/0000-0001-7832-5604

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=osu162740833871401

Year and Degree

2021, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.

Abstract

Ranking is a fundamental problem that has been extensively studied in the machine learning community due to its broad applicability to different application areas such as recommendation, web searching, social choices, and crowd sourcing. The problems of ranking focus on finding the full ranking or partial rankings (e.g., top-$k$ selection) of a set of items from noisy observations. The items may refer to choices, movies, pages, and advertisements and the observations may refer to the queries about the users' preferences on these items. In many systems, the observations can be done in an active (or adaptive) manner, i.e., for each observation, the learner can adaptively choose items to observe according to past observations. An interesting problem is to understand the number of observations needed (aka sample complexity) for finding the ranking under the active setting. In this dissertation, we present new algorithms and derive new sample complexity upper and lower bounds that improve the state-of-the-art for three significant and fundamental problems in this topic. This dissertation explores the following problems: I. Exploring top-fraction arms in stochastic bandits. The stochastic multi-armed bandit (MAB) model is a classical online learning model for studying sequential decision making under uncertainty. In an MAB model, there are multiple arms (each refers to an action or an item), and each sample (aka pull) of an arm returns an independent reward according to this arm's unknown latent distribution. The first problem is to find $k$ distinct arms that are approximately within the top-$\rho$ fraction with respect to the mean rewards of the arms with error tolerance $\epsilon$ and confidence $1-\delta$. We develop algorithms and analyze the sample complexity lower and upper bounds for four variations (the number of arms is infinite or finite; the threshold of the top-$\rho$ fraction is known or unknown). The upper bounds and lower bounds of all four variations are optimal up to at most logarithmic factors. II. Exact full ranking from noisy comparisons. The problem we investigate is to find the exact full ranking of a set of items from noisy comparisons with confidence $1-\delta$, where a comparison is a process over two (pairwise comparisons) or multiple (multi-wise comparisons) items that outputs a noisy result about the (most) preferred one. We only make three standard or mild assumptions: (i) comparisons are independent across time and items; (ii) there is a strict order of these items; and (iii) the items satisfy the weak stochastic transitivity. We prove a nearly optimal sample complexity lower bound for this problem, and develop an algorithm whose sample complexity upper bound matches the lower bound (up to a constant factor) under mild conditions. III. Top-$k$ items selection from pairwise comparisons. In this problem, we want to study how to find the top-$k$ items from noisy pairwise comparisons under the strong stochastic transitivity and the stochastic triangle inequality. We study two cases: i) finding the probably approximately correct (PAC) top-$k$ items and ii) finding the exact top-$k$ items. For both cases, we develop algorithms and derive new lower bounds. For the PAC top-$k$ selection, our upper and lower bounds are optimal up to a constant factor. For the exact top-$k$ selection, our upper and lower bounds are optimal up to a loglog factor for $k = 1$ and optimal up to a log factor for $k > 1$.

Committee

Ness Shroff (Advisor)
Jia Liu (Advisor)
Raef Bassily (Committee Member)
Shaileshh Bojja Venkatakrishnan (Committee Member)

Pages

214 p.

Subject Headings

Artificial Intelligence; Computer Science

Keywords

Ranking; Preference Learning; Active Learning; Learning Theory; Multi-Armed Bandits; Pure Exploration; Reinforcement Learning

Ren, W. (2021). Active Learning for Ranking from Noisy Observations [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu162740833871401
APA Style (7th edition)
Ren, Wenbo. Active Learning for Ranking from Noisy Observations. 2021. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu162740833871401.
MLA Style (8th edition)
Ren, Wenbo. "Active Learning for Ranking from Noisy Observations." Doctoral dissertation, Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu162740833871401
Chicago Manual of Style (17th edition)

Document number:

osu162740833871401

Download Count:

81

Copyright Info

Global Search Box

Files

File List

ETD Abstract Container

Abstract Header

Active Learning for Ranking from Noisy Observations

Abstract Details

Recommended Citations

Citations

Abstract Footer

Global Footer

Ohio Department of Higher Education

State Government Links

Education Links

Global Search Box

Files

File List

ETD Abstract Container

Abstract Header

Active Learning for Ranking from Noisy Observations

Abstract Details

Recommended CitationsRefworksEndNoteRISMendeley

Citations

Abstract Footer

Global Footer

Ohio Department of Higher Education

State Government Links

Education Links

Recommended Citations