Efficient Online Learning with Bandit Feedback


2020, Doctor of Philosophy, Ohio State University, Electrical and Computer Engineering.
Online learning is widely used in modern machine learning systems. Despite recent progress, many challenges remain unsolved when online learning is applied to complex practical scenarios. The key challenges can be summarized as follows. First, the curse of dimensionality of the action space: to learn the optimal action, the learning algorithm must explore every action to some extent, so a large action space makes the algorithm converge slowly and incur a large cost in cold-start cases. In online recommendation systems, for example, the number of users or items is large compared to the finite exploration budget. Second, non-stationary environments: real-world learning systems are highly dynamic; users' preferences change over time due to various internal and external factors, and item popularity varies with fast-emerging events. Failing to adapt to a changing environment may lead to sub-optimal decisions. Third, computational resource constraints: online learning algorithms must be updated with the arriving data stream, which may require heavy computation for each new data point. These frequent updates consume substantial computational resources and keep online learning tools out of real-time applications.

In this dissertation, we aim to develop efficient online learning solutions, and more specifically multi-armed bandit solutions, that address these challenges. First, we exploit correlations between actions to reduce the action-space complexity of the online learning problem. We start with a special case of correlation, in which playing one action also generates side observations of other actions, and then generalize the result to an arbitrary form of correlation over actions. Second, we consider multi-armed bandits in a non-stationary environment, so that the learning algorithm can automatically detect potential changes and adapt its decision-making strategy accordingly. Finally, we turn to the trade-off between the computational complexity and the optimality of bandit algorithms: our goal is to design a class of bandit algorithms that can be computed efficiently while behaving close to the optimal algorithms.

Taken together, the proposed research makes an online learning system more efficient in large-scale applications, more adaptive to a changing world, and less demanding of computational resources. The proposed solutions apply to a wide spectrum of applications, including recommendation systems, viral marketing, communication networks, clinical trials in healthcare, sequential experiment design, and many more.
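
To make the first thrust concrete, here is a minimal Python sketch of UCB1 extended with graph-structured side observations, assuming Bernoulli rewards and a known feedback graph in which playing arm i also reveals the rewards of its neighbors. The function and variable names are illustrative assumptions, not the dissertation's actual algorithm.

```python
import math
import random

def ucb_with_side_observations(true_means, neighbors, horizon, seed=0):
    """Sketch: UCB1 where each play also observes neighboring arms."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n        # observations per arm (plays + side observations)
    sums = [0.0] * n        # cumulative observed reward per arm

    for t in range(1, horizon + 1):
        if 0 in counts:
            # Ensure every arm has at least one observation first.
            arm = counts.index(0)
        else:
            # Standard UCB1 index: empirical mean plus exploration bonus.
            arm = max(range(n), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        # Playing `arm` yields its own reward plus free observations of
        # every neighbor, shrinking the effective exploration cost.
        for j in [arm] + sorted(neighbors[arm]):
            reward = 1.0 if rng.random() < true_means[j] else 0.0
            counts[j] += 1
            sums[j] += reward
    return counts, sums
```

For instance, on a line graph with neighbors = {0: [1], 1: [0, 2], 2: [1]}, each pull yields two or three observations per round, so exploration is cheaper than in the vanilla bandit where each pull yields exactly one.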
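For the second thrust, the following sketch shows one simple way a bandit can detect changes and adapt: keep a short window of recent rewards per arm and restart the learner when the window mean drifts far from the arm's long-run mean. This is in the spirit of actively adaptive, change-detection-based methods; the window length and threshold are illustrative placeholders, not tuned constants from the dissertation.

```python
import math
from collections import deque

def adaptive_ucb(reward_fn, n_arms, horizon, window=30, threshold=0.3):
    """Sketch: UCB1 with a restart triggered by a detected mean shift."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    recent = [deque(maxlen=window) for _ in range(n_arms)]
    since_reset = 0
    for t in range(1, horizon + 1):
        since_reset += 1
        if 0 in counts:
            arm = counts.index(0)
        else:
            arm = max(range(n_arms), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(since_reset) / counts[i]))
        r = reward_fn(t, arm)            # environment may drift with t
        counts[arm] += 1
        sums[arm] += r
        recent[arm].append(r)
        # Detector: a full recent window whose mean deviates from the
        # arm's long-run mean suggests the reward distribution changed.
        win = recent[arm]
        if (len(win) == window
                and abs(sum(win) / window - sums[arm] / counts[arm]) > threshold):
            counts = [0] * n_arms        # forget the stale statistics
            sums = [0.0] * n_arms
            recent = [deque(maxlen=window) for _ in range(n_arms)]
            since_reset = 0
    return counts, sums
```

A toy environment with one change point, e.g. reward_fn = lambda t, a: float(random.random() < (0.9 if (a == 0) == (t <= 500) else 0.1)), flips the best arm at t = 500; the restart lets the learner abandon the formerly best arm instead of exploiting it indefinitely.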
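For the third thrust, one hypothetical way to trade optimality for computation is to batch the expensive index updates: recompute the UCB indices only every `batch` rounds, so the heavy step runs O(T / batch) times rather than every round. This sketch is an assumption-laden illustration of the complexity-versus-optimality trade-off, not the class of algorithms actually proposed in the dissertation.

```python
import math
import random

def batched_ucb(true_means, horizon, batch=50, seed=0):
    """Sketch: UCB1 with index recomputation only once per batch."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [1] * n
    sums = [0.0] * n
    for i in range(n):                   # pull each arm once to initialize
        sums[i] = 1.0 if rng.random() < true_means[i] else 0.0
    best = 0
    for t in range(n + 1, horizon + 1):
        if (t - n) % batch == 1 or batch == 1:
            # The expensive step: recompute all UCB indices. With
            # batching it runs O(horizon / batch) times in total.
            best = max(range(n), key=lambda i: sums[i] / counts[i]
                       + math.sqrt(2 * math.log(t) / counts[i]))
        # Between refreshes the per-round work is O(1): replay `best`.
        reward = 1.0 if rng.random() < true_means[best] else 0.0
        counts[best] += 1
        sums[best] += reward
    return counts, sums
```

Larger batches cut computation but delay reaction to new evidence, so the regret degrades gracefully with the batch size; picking that schedule is exactly the kind of trade-off the dissertation studies.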
Ness Shroff (Advisor)
Eylem Ekici (Committee Member)
Yingbin Liang (Committee Member)
Abhishek Gupta (Committee Member)
219 p.

Recommended Citations


  • Liu, F. (2020). Efficient Online Learning with Bandit Feedback [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587680990430268

    APA Style (7th edition)

  • Liu, Fang. Efficient Online Learning with Bandit Feedback. 2020. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1587680990430268.

    MLA Style (8th edition)

  • Liu, Fang. "Efficient Online Learning with Bandit Feedback." Doctoral dissertation, Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587680990430268

    Chicago Manual of Style (17th edition)