Reinforcement Learning Algorithms: Acceleration Design and Non-asymptotic Theory


2021, Doctor of Philosophy, Ohio State University, Electrical and Computer Engineering.
Reinforcement learning (RL) aims to design strategies by which an agent, through interaction with an environment, finds a desirable policy that maximizes the cumulative reward for a task. RL has received rapidly growing attention in recent years and has achieved tremendous success across a variety of application domains. Yet existing RL algorithms can still suffer from poor performance, such as slow convergence, high variance, and suboptimality. As a result, important implementation techniques have been developed to overcome these issues, including momentum schemes, the adaptive moment estimation (Adam) method, and the double Q-learning architecture. The first focus of this dissertation is therefore the development of novel techniques that further improve the performance of RL algorithms. Furthermore, in contrast to the wide application of these techniques in RL, the understanding of their theoretical performance remains rather limited. The second focus of this dissertation is thus to provide convergence analysis for RL algorithms that adopt popular acceleration techniques, including 1) momentum schemes, 2) Adam-type schemes, and 3) the double Q-learning architecture.

First, we propose a new momentum scheme and apply it to Q-learning under both the tabular and function approximation settings. We show that the proposed momentum-based Q-learning algorithms achieve a better convergence rate than existing momentum-based Q-learning, and our experimental results show that the proposed algorithms outperform other existing algorithms of the same type.

Second, we establish the first convergence guarantee for Adam-type RL algorithms. In particular, we apply AMSGrad (a convergent variant of Adam) updates to temporal difference (TD) learning and policy gradient (PG) and establish their finite-time results under a practical Markovian sampling strategy. In addition, we study Q-learning with AMSGrad updates and propose a new Adam-type Q-learning algorithm that adopts a momentum restart technique. Our numerical results show that the proposed Adam-type Q-learning algorithm outperforms the popular deep Q-network (DQN) method.

Last, we provide the first finite-time convergence analysis for double Q-learning. We model the double Q-learning algorithm as a nested stochastic approximation (SA) problem and develop new techniques to analyze it. We provide non-asymptotic convergence results for both synchronous and asynchronous double Q-learning under polynomial learning rates, and further improve the results by adopting constant learning rates.
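For readers unfamiliar with the double Q-learning architecture analyzed in the final part of the abstract, the sketch below shows the standard tabular, asynchronous variant (two Q-tables, one chosen at random to be updated at each step, each using the other to evaluate its greedy action). This is only an illustrative baseline: it assumes a Gymnasium-style environment with reset()/step(), a constant learning rate alpha, and epsilon-greedy exploration, and it does not reproduce the specific learning-rate schedules or acceleration schemes developed in the dissertation.

```python
import numpy as np

def double_q_learning(env, num_states, num_actions, alpha=0.1, gamma=0.99,
                      epsilon=0.1, num_steps=100_000, seed=0):
    """Tabular (asynchronous) double Q-learning with epsilon-greedy exploration."""
    rng = np.random.default_rng(seed)
    q_a = np.zeros((num_states, num_actions))
    q_b = np.zeros((num_states, num_actions))

    state, _ = env.reset(seed=seed)
    for _ in range(num_steps):
        # Epsilon-greedy behavior policy derived from the sum of the two tables.
        if rng.random() < epsilon:
            action = int(rng.integers(num_actions))
        else:
            action = int(np.argmax(q_a[state] + q_b[state]))

        next_state, reward, terminated, truncated, _ = env.step(action)

        # The defining step of double Q-learning: pick one estimator at random
        # to update; it selects the greedy next action, while the other
        # estimator evaluates that action, which reduces overestimation bias.
        if rng.random() < 0.5:
            greedy = int(np.argmax(q_a[next_state]))
            target = reward + (0.0 if terminated else gamma * q_b[next_state, greedy])
            q_a[state, action] += alpha * (target - q_a[state, action])
        else:
            greedy = int(np.argmax(q_b[next_state]))
            target = reward + (0.0 if terminated else gamma * q_a[next_state, greedy])
            q_b[state, action] += alpha * (target - q_b[state, action])

        state = next_state
        if terminated or truncated:
            state, _ = env.reset()

    return q_a, q_b
```

The constant learning rate used here corresponds to the setting under which the abstract reports improved finite-time results; the polynomial-learning-rate analysis mentioned there would instead decay alpha with the number of visits to each state-action pair.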
Yingbin Liang (Advisor)
Wei Zhang (Committee Member)
Philip Schniter (Committee Member)
Hesham El Gamal (Committee Member)
262 p.

Recommended Citations


  • Xiong, H. (2021). Reinforcement Learning Algorithms: Acceleration Design and Non-asymptotic Theory [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1617960067796074

    APA Style (7th edition)

  • Xiong, Huaqing. Reinforcement Learning Algorithms: Acceleration Design and Non-asymptotic Theory. 2021. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1617960067796074.

    MLA Style (8th edition)

  • Xiong, Huaqing. "Reinforcement Learning Algorithms: Acceleration Design and Non-asymptotic Theory." Doctoral dissertation, Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1617960067796074

    Chicago Manual of Style (17th edition)