Convex Optimization Theory
Preface:
This book aims at an accessible, concise, and intuitive exposition of two related subjects that find broad practical application:
(a) Convex analysis, particularly as it relates to optimization.
(b) Duality theory for optimization and minimax problems, mainly within a convexity framework.
The focus on optimization is to derive conditions for existence of primal and dual optimal solutions for constrained problems such as
minimize f(x) subject to x\in X, g_j(x)<= 0, j=1,...,r.
Other types of optimization problems, such as those arising in Fenchel duality, are also part of our scope. The focus on minimax is to derive conditions guaranteeing the equality
inf_x sup_z phi(x,z) = sup_z inf_x phi(x,z),
and the attainment of the ``inf" and the ``sup."
The treatment of convexity theory is fairly detailed. It touches upon nearly all major aspects of the subject, and it is sufficient for the development of the core analytical issues of convex optimization. The mathematical prerequisites are a first course in linear algebra and a first course in real analysis. A summary of the relevant material is provided in an appendix. Prior knowledge of linear and nonlinear optimization theory is not assumed, although it will undoubtedly be helpful in providing context and perspective. Other than this modest background, the development is self-contained, with rigorous proofs provided throughout.
We have aimed at a unified development of the strongest possible forms of duality with the most economic use of convexity theory. To this end, our analysis often departs from the lines of Rockafellar's classic 1970 book and other books that followed the Fenchel/Rockafellar formalism. For example, we treat differently closed set intersection theory and preservation of closure under linear transformations (Sections 1.4.2 and 1.4.3); we develop subdifferential calculus by using constrained optimization duality (Section 5.4.2); and we do not rely on concepts such as infimal convolution, image, polar sets and functions, bifunctions, and conjugate saddle functions. Perhaps our greatest departure is in duality theory itself: similar to Fenchel/Rockafellar, our development rests on Legendre/Fenchel conjugacy ideas, but is far more geometrical and visually intuitive.
Our duality framework is based on two simple geometrical problems: the min common point problem and the max crossing point problem. The salient feature of the min common/max crossing (MC/MC) framework is its highly visual geometry, through which all the core issues of duality theory become apparent and can be analyzed in a unified way. Our approach is to obtain a handful of broadly applicable theorems within the MC/MC framework, and then specialize them to particular types of problems (constrained optimization, Fenchel duality, minimax problems, etc). We address all duality questions (existence of duality gap, existence of dual optimal solutions, structure of the dual optimal solution set), and other issues (subdifferential theory, theorems of the alternative, duality gap estimates) in this way.
Fundamentally, the MC/MC framework is closely connected to the conjugacy framework, and owes its power and generality to this connection. However, the two frameworks offer complementary starting points for analysis and provide alternative views of the geometric foundation of duality: conjugacy emphasizes functional/algebraic descriptions, while MC/MC emphasizes set/epigraph descriptions. The MC/MC framework is simpler, and seems better suited for visualizing and investigating questions of strong duality and existence of dual optimal solutions. The conjugacy framework, with its emphasis on functional descriptions, is more suitable when mathematical operations on convex functions are involved, and the calculus of conjugate functions can be brought to bear for analysis or computation.
The book evolved from the earlier book of the author [BNO03] on the subject (coauthored with A. Nedic and A. Ozdaglar), but has different character and objectives. The 2003 book was quite extensive, was structured (at least in part) as a research monograph, and aimed to bridge the gap between convex and nonconvex optimization using concepts of nonsmooth analysis. By contrast, the present book is organized differently, has the character of a textbook, and concentrates exclusively on convex optimization. Despite the differences, the two books have similar style and level of mathematical sophistication, and share some material.
The chapter-by-chapter description of the book follows:
Chapter 1: This chapter develops all of the convex analysis tools that are needed for the development of duality theory in subsequent chapters. It covers basic algebraic concepts such as convex hulls and hyperplanes, and topological concepts such as relative interior, closure, preservation of closedness under linear transformations, and hyperplane separation. In addition, it develops subjects of special interest in duality and optimization, such as recession cones and conjugate functions.
Chapter 2: This chapter covers polyhedral convexity concepts: extreme points, the Farkas and Minkowski-Weyl theorems, and some of their applications in linear programming. It is not needed for the developments of subsequent chapters, and may be skipped at first reading.
Chapter 3: This chapter focuses on basic optimization concepts: types of minima, existence of solutions, and a few topics of special interest for duality theory, such as partial minimization and minimax theory.
Chapter 4: This chapter introduces the MC/MC duality framework. It discusses its connection with conjugacy theory, and it charts its applications to constrained optimization and minimax problems. It then develops broadly applicable theorems relating to strong duality and existence of dual optimal solutions.
Chapter 5: This chapter specializes the duality theorems of Chapter 4 to important contexts relating to linear programming, convex programming, and minimax theory. It also uses these theorems as an aid for the development of additional convex analysis tools, such as a powerful nonlinear version of Farkas' Lemma, subdifferential theory, and theorems of the alternative. A final section is devoted to nonconvex problems and estimates of the duality gap, with special focus on separable problems.
In aiming for brevity, I have omitted a number of topics that an instructor may wish for. One such omission is applications to specially structured problems; the book by Boyd and Vanderbergue [BoV04], as well as my book on parallel and distributed computation with John Tsitsiklis [BeT89] cover this material extensively (both books are available on line).
Another important omission is computational methods. However, I have written a long supplementary sixth chapter (over 100 pages), which covers the most popular convex optimization algorithms (and some new ones), and can be downloaded from the book's web page
http://www.athenasc.com/convexduality.html
This chapter, together with a more comprehensive treatment of convex analysis, optimization, duality, and algorithms will be part of a more extensive textbook that I am currently writing. Until that time, the chapter will serve instructors who wish to cover convex optimization algorithms in addition to duality (as I do in my M.I.T. course). This is a ``living" chapter that will be periodically updated. Its current contents are as follows:
Chapter 6 on Algorithms: 6.1. Problem Structures and Computational Approaches; 6.2. Algorithmic Descent; 6.3. Subgradient Methods; 6.4. Polyhedral Approximation Methods; 6.5. Proximal and Bundle Methods; 6.6. Dual Proximal Point Algorithms; 6.7. Interior Point Methods; 6.8. Approximate Subgradient Methods; 6.9. Optimal Algorithms and Complexity.
While I did not provide exercises in the text, I have supplied a substantial number of exercises (with detailed solutions) at the book's web page. The reader/instructor may also use the end-of-chapter problems (a total of 175) given in [BNO03], which have similar style and notation to the present book. Statements and detailed solutions of these problems can be downloaded from the book's web page and are also available on line at
http://www.athenasc.com/convexity.html
The book may be used as a text for a theoretical convex optimization course; I have taught several variants of such a course at MIT and elsewhere over the last ten years. It may also be used as a supplementary source for nonlinear programming classes, and as a theoretical foundation for classes focused on convex optimization models (rather than theory).
The book has been structured so that the reader/instructor can use the material selectively. For example, the polyhedral convexity material of Chapter 2 can be omitted in its entirety, as it is not used in Chapters 3-5. Similarly, the material on minimax theory (Sections 3.4, 4.2.5, and 5.5) may be omitted; and if this is done, Sections 3.3 and 5.3.4, which use the tools of partial minimization, may be omitted. Also, Sections 5.4-5.7 are ``terminal" and may each be omitted without effect on other sections.
A ``minimal" self-contained selection, which I have used in my nonlinear programming class at MIT (together with the supplementary web-based Chapter 6 on algorithms), consists of the following:
Chapter 1, except for Sections 1.3.3 and 1.4.1.
Section 3.1.
Chapter 4, except for Section 4.2.5.
Chapter 5, except for Sections 5.2, 5.3.4, and 5.5-5.7.
This selection focuses on nonlinear convex optimization, and excludes all the material relating to polyhedral convexity and minimax theory.
I would like to express my thanks to several colleagues for their contributions to the book. My collaboration with Angelia Nedic and Asuman Ozdaglar on our 2003 book was important in laying the foundations of the present book. Huizhen (Janey) Yu read carefully early drafts of portions of the book, and offered several insightful suggestions. Paul Tseng contributed substantially through our joint research on set intersection theory, given in part in Section 1.4.2 (this research was motivated by earlier collaboration with Angelia Nedic). Feedback from students and colleagues, including Dimitris Bisias, Vivek Borkar, John Tsitsiklis, Mengdi Wang, and Yunjian Xu, is highly appreciated. Finally, I wish to thank the many outstanding students in my classes, who have been a continuing source of motivation and inspiration.