by Dimitri P. Bertsekas
Publication: November 2023, 421 pages, hardcover
EBOOK at Google Play
This textbook is an outgrowth of a course in Reinforcement Learning (RL) that the author has offered in the years 2019-2023 at Arizona State University. The purpose of the course is to give an overview of the RL methodology, particularly as it relates to problems of optimal and suboptimal decision and control, as well as discrete optimization.
The book takes the view that RL rests on the analytical and computational sequential decision making foundation of Dynamic Programming (DP). It provides methodologies for approximating optimal values and policies, using neural networks among other approaches. Much of the development rests on visualization and intuition, with references to the mathematical analysis provided in earlier RL and DP books by the author. The textbook contains many examples and end-of-chapter exercises, most of which are solved.
An important structural characteristic of the textbook is that it is organized in a modular way, with a view towards flexibility, so it can be easily modified to accommodate changes in course content. In particular, the textbook is divided into two parts:
(1) A foundational platform, which consists of Chapter 1. It provides a selective overview of the approximate DP/RL landscape, and serves as a starting point for a more detailed in-class development of other RL topics, whose choice can be at the instructor's discretion.
(2) An in-depth coverage of the methodologies of deterministic and stochastic rollout in Chapter 2, and of the use of neural networks and other approximation architectures for off-line training of values and policies in Chapter 3.
In a different course, alternative choices for in-depth coverage may be made, using the same foundational platform. In particular, both more and less mathematically-oriented courses can be built upon the platform of Chapter 1. Videolectures and slides from the course are available from the author's website.
The book is an excellent supplement to our other Dynamic Programming, Abstract Dynamic Programming, Reinforcement Learning, and Rollout books.
Dimitri P. Bertsekas is Fulton Professor of Computational Decision Making at Arizona State University, McAfee Professor of Engineering at the Massachusetts Institute of Technology, and a member of the prestigious United States National Academy of Engineering. He is the recipient of the 2001 John R. Ragazzini ACC Education Award and the 2009 INFORMS Expository Writing Award. He has also received the 2014 ACC Richard E. Bellman Control Heritage Award for "contributions to the foundations of deterministic and stochastic optimization-based methods in systems and control," the 2014 Khachiyan Prize for Life-Time Accomplishments in Optimization, the SIAM/MOS 2015 George B. Dantzig Prize, and the 2022 IEEE Control Systems Award. Together with his coauthor John Tsitsiklis, he was awarded the 2018 INFORMS John von Neumann Theory Prize for the contributions of the research monographs "Parallel and Distributed Computation" and "Neuro-Dynamic Programming".
Videolectures, Slides, and Other Course Material from ASU classes, 2019-2023.