Introduction
- Cost-to-go Approximations in Dynamic Programming
- Approximation Architectures
- Simulation and Training
- Neuro-Dynamic Programming
- Notes and Sources
Dynamic Programming
- Introduction
- Stochastic Shortest Path Problems
- Discounted Problems
- Problem Formulation and Examples
- Notes and Sources
Neural Network Architectures and Training
- Architectures for Approximation
- Neural Network Training
- Notes and Sources
Stochastic Iterative Algorithms
- The Basic Model
- Convergence Based on a Smooth Potential Function
- Convergence under Contraction or Monotonicity Assumptions
- The ODE Approach
- Notes and Sources
Simulation Methods for a Lookup Table Representation
- Some Aspects of Monte Carlo Simulation
- Policy Evaluation by Monte Carlo Simulation
- Temporal Difference Methods
- Optimistic Policy Iteration
- Simulation-Based Value Iteration
- Q-Learning
- Notes and Sources
Approximate DP with Cost-to-Go Function Approximation
- Generic Issues - From Parameters to Policies
- Approximate Policy Iteration
- Approximate Policy Evaluation Using TD(λ)
- Optimistic Policy Iteration
- Approximate Value Iteration
- Q-Learning and Advantage Updating
- Value Iteration with State Aggregation
- Euclidean Contractions and Optimal Stopping
- Value Iteration with Representative States
- Bellman Error Methods
- Continuous States and the Slope of the Cost-to-Go
- Approximate Linear Programming
- Overview
- Notes and Sources
Extensions
- Average Cost per Stage Problems
- Dynamic Games
- Parallel Computation Issues
- Notes and Sources
Case Studies
- Parking
- Football
- Tetris
- Combinatorial Optimization - Maintenance and Repair
- Dynamic Channel Allocation
- Backgammon
- Notes and Sources
Appendix A: Mathematical Review
Appendix B: On Probability Theory and Markov Chains
References
Index