Skip to Main Content
 

Global Search Box

 
 
 

ETD Abstract Container

Abstract Header

Verifying Value Iteration and Policy Iteration in Coq

Abstract Details

2021, Master of Science (MS), Ohio University, Electrical Engineering & Computer Science (Engineering and Technology).
Reinforcement learning is a growing field of research, but little work is being done to verify the correctness of reinforcement learning algorithms. Researchers are exploring the use of reinforcement learning in safety critical systems such as self-driving cars and autonomous aircraft, so mathematical proofs of correctness of the underlying reinforcement learning algorithms would greatly improve our confidence in the systems that utilize reinforcement learning. This project verifies convergence and optimality of two fundamental reinforcement learning algorithms: value iteration and policy iteration. These algorithms converge and are optimal if they eventually produce an optimal policy. It also is designed to be extensible to future research into verified reinforcement learning.
Gordon Stewart (Advisor)
Bunescu Razvan (Committee Member)
Juedes David (Committee Member)
Elster Charlotte (Committee Member)
63 p.

Recommended Citations

Citations

  • Masters, D. M. (2021). Verifying Value Iteration and Policy Iteration in Coq [Master's thesis, Ohio University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1618999718015199

    APA Style (7th edition)

  • Masters, David. Verifying Value Iteration and Policy Iteration in Coq. 2021. Ohio University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1618999718015199.

    MLA Style (8th edition)

  • Masters, David. "Verifying Value Iteration and Policy Iteration in Coq." Master's thesis, Ohio University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1618999718015199

    Chicago Manual of Style (17th edition)