Abstract Dynamic Programming, 3rd Edition

ISBN: 1-886529-47-7, 978-1-886529-47-2
Publication: January 2022, 420 pages, hardcover
Price: $65.00

EBOOK at Google Play

A research monograph providing a synthesis of old research on the foundations of dynamic programming, with the modern theory of approximate dynamic programming and new research on semicontractive models.

It aims at a unified and economical development of the core theory and algorithms of total cost sequential decision problems, based on the strong connections of the subject with fixed point theory. The analysis focuses on the abstract mapping that underlies dynamic programming and defines the mathematical character of the associated problem. The discussion centers on two fundamental properties that this mapping may have: monotonicity and (weighted sup-norm) contraction. It turns out that the nature of the analytical and algorithmic DP theory is determined primarily by the presence or absence of these two properties, and the rest of the problem's structure is largely inconsequential. New research is focused on two areas: 1) The ramifications of these properties in the context of algorithms for approximate dynamic programming, and 2) The new class of semicontractive models, exemplified by stochastic shortest path problems, where some but not all policies are contractive.

The 2nd edition aims primarily to amplify the presentation of the semicontractive models of Chapter 3 and Chapter 4 of the first (2013) edition, and to supplement it with a broad spectrum of research results that I obtained and published in journals and reports since the first edition was written (see below). As a result, the size of this material more than doubled, and the size of the book increased by nearly 40%.

The 3rd edition is similar to the 2nd edition, but includes a new chapter on abstract DP methods for minimax and zero sum game problems, which is based on the author's recent work. A primary motivation is the resolution of some long-standing convergence difficulties of the "natural" policy iteration algorithm, which have been known since the Pollatschek and Avi-Itzhak method for finite-state Markov games. Mathematically, this "natural" algorithm is a form of Newton's method for solving the corresponding Bellman's equation, but Newton's method, contrary to the case of single-player DP problems, is not globally convergent in the case of a minimax problem, because the Bellman operator may have components that are neither convex nor concave. Our approach as been to introduce a special type of abstract Bellman operator for minimax problems, which involves a parametric contraction mapping with a uniform fixed point.

The book is an excellent supplement to our Dynamic Programming and Reinforcement Learning books, and particularly our Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control book.

From the review by Panos Pardalos (Optimization Methods and Software):

"The mathematical development in this book is elegant and mathematically rigorous, relying on the power of abstraction to focus on the fundamentals. The monograph provides for the first time a comprehensive synthesis of the field, while presenting much new research, some of which is relevant to currently very active fields such as approximate dynamic programming. Many examples are sprinkled through the book, illustrating the unifying power of the theory and applying it to specific types of problems, such as discounted, stochastic shortest path, semi-Markov, minimax, sequential games, multiplicative, and risk-sensitive models. The book also includes end-of-chapter exercises (with complete solutions), which supplement the text with examples, counterexamples, and theoretical extensions."

"Like several other books by Bertsekas, this book is well-written, and well-suited for self-study. It can be used as a supplement to graduate dynamic programming classes. I highly recommend this book for everyone interested in dynamic optimization."

From the review by Eugene Feinberg at Amazon.com:

"Dimitri Bertsekas is also the author of "Dynamic Programming and Optimal Control," Athena Scientific, 2007, a comprehensive text in which most of the dynamic programming concepts and applications are presented in a way interesting and available to a large spectrum of readers from undergraduate students in business and engineering to researches in the field. He wrote many other wonderful books on dynamic programming, data networks, probability, numerical methods, and optimization. His books are well written, informative, and easy to read. This book carries all these merits."

"Historical notes, sources, and exercises are also included at the end of each chapter, and solutions to all exercises are provided. This, and the lucid exposition, makes this book ideal both for self-study and as a supplement for graduate-level courses/texts in dynamic programming or reinforcement learning. In addition, readers interested in pursuing research in dynamic programming can find new research directions mentioned in the introduction of the book."

Among its features, the book:

develops the algorithmic foundations of approximate dynamic programming within an abstract unifying framework
discusses algorithms for approximate dynamic programming within a broadly applicable setting
develops the connections of dynamic programming with fixed point theory
contains significant new research

Dimitri P. Bertsekas is McAfee Professor of Engineering at the Massachusetts Institute of Technology and a member of the prestigious United States National Academy of Engineering. He is the recipient of the 2001 A. R. Raggazini ACC education award and the 2009 INFORMS expository writing award. He has also received 2014 ACC Richard E. Bellman Control Heritage Award for "contributions to the foundations of deterministic and stochastic optimization-based methods in systems and control," the 2014 Khachiyan Prize for Life-Time Accomplishments in Optimization, the SIAM/MOS 2015 George B. Dantzig Prize, and the 2022 IEEE Control Systems Award. Together with his coauthor John Tsitsiklis, he was awarded the 2018 INFORMS John von Neumann Theory Prize, for the contributions of the research monographs "Parallel and Distributed Computation" and "Neuro-Dynamic Programming".

Supplementary Material:

The following documents have a strong connection to the book, and amplify on the analysis and the range of applications of the semicontractive models of Chapters 3 and 4:

D. P. Bertsekas, "Regular Policies in Abstract Dynamic Programming", Lab. for Information and Decision Systems Report LIDS-P-3173, MIT, May 2015 (revised August, 2016); to appear in SIAM J. on Control and Optimization (Related Lecture Slides); (Related Video Lectures).
D. P. Bertsekas, "Value and Policy Iteration in Deterministic Optimal Control and Adaptive Dynamic Programming", Lab. for Information and Decision Systems Report LIDS-P-3174, MIT, May 2015 (revised Sept. 2015); IEEE Transactions on Neural Networks and Learning Systems, Vol. 28, 2017, pp. 500-509.
D. P. Bertsekas and H. Yu, "Stochastic Shortest Path Problems Under Weak Conditions", Lab. for Information and Decision Systems Report LIDS-P-2909, MIT, January 2016.
D. P. Bertsekas, "Robust Shortest Path Planning and Semicontractive Dynamic Programming", Lab. for Information and Decision Systems Report LIDS-P-2915, MIT, Jan. 2016; revised in June 2016; to appear in Naval Research Logistics.
D. P. Bertsekas, "Affine Monotonic and Risk-Sensitive Models in Dynamic Programming", Lab. for Information and Decision Systems Report LIDS-3204, MIT, June 2016; to appear in IEEE Transactions on Aut. Control.
D. P. Bertsekas, "Stable Optimal Control and Semicontractive Dynamic Programming", Lab. for Information and Decision Systems Report LIDS-P-3506, MIT, May 2017; to appear in SIAM J. on Control and Optimization (Related Lecture Slides).
D. P. Bertsekas, Proper Policies in Infinite-State Stochastic Shortest Path Problems", Lab. for Information and Decision Systems Report LIDS-3507, MIT, May 2017; to appear in IEEE Transactions on Aut. Control. (Related Lecture Slides).
An updated version of Chapter 4 of the author's Dynamic Programming book, Vol. II, which incorporates recent research on a variety of undiscounted problems and relates to abstract DP topics; (Related Lecture Slides).
Chapter 5 and Appendix C of 1st Edition, which deals with restricted policies, and is motivated in part by the complex measurability questions that arise in mathematically rigorous theories of stochastic optimal control with Borel state and control spaces. This material is covered in Chapter 6 of the monograph by Bertsekas and Shreve, and followup research on the subject has been limited. Thus it was decided to just post here Chapter 5 and Appendix C of the first edition, and omit them from the second edition.

[Return to Athena Scientific Homepage]

Abstract Dynamic Programming, 3rd Edition

Related Videos and Slides:

Supplementary Material: