Abstract
We argue that reducing nonlinear programming problems to a simple canonical form is an effective way to analyze them, especially when the gradients of the constraints are linearly dependent. To illustrate this fact, we use this canonical form to solve an open problem about constraint qualifications.
References
Behling, R., Haeser, G., Ramos, A., Viana, D.: On a conjecture in second-order optimality conditions (2017). arXiv:1706.07833v1 [math.OC]
Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer, Berlin (1983)
Perko, L.: Differential Equations and Dynamical Systems. Springer, New York (1991)
Mascarenhas, W.F.: The affine scaling algorithm fails for stepsize 0.999. SIAM J. Optim. 7, 34–46 (1997)
Mascarenhas, W.F.: Newton’s iterates can converge to non-stationary points. Math. Program. 112, 327–334 (2008)
Mascarenhas, W.F.: The BFGS algorithm with exact line searches fails for nonconvex functions. Math. Program. 99, 49–61 (2004)
Mascarenhas, W.F.: On the divergence of line search methods. Comput. Appl. Math. 26, 129–169 (2007)
Mascarenhas, W.F.: The divergence of the BFGS and Gauss Newton methods. Math. Program. 147, 253–276 (2014)
Andreani, R., Martínez, J.M., Schuverdt, M.L.: On second-order optimality conditions for nonlinear programming. Optimization 56, 529–542 (2007)
Spivak, M.: Calculus on Manifolds, A Modern Approach to Classical Theorems of Advanced Calculus. Addison-Wesley, Boston (1965)
Haeser, G.: An extension of Yuan’s lemma and its applications in optimization. J. Optim. Theory Appl. 174, 641–649 (2017)
Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494–498 (1941)
Appendix
In this appendix, we prove the results involving Linear Algebra used in Sect. 5. The proof of Lemma 5.1 is based on the next two lemmas:
Lemma A.1
(Rank one columns) If the symmetric matrices \(\mathbf {H}_1,\dots , \mathbf {H}_m \in {\mathbb {R}}^{n \times n}\) are such that the \(n \times m\) matrix
\[ \mathbf{A}_{\mathbf{v}} := \left( \mathbf{H}_1 \mathbf{v}, \, \mathbf{H}_2 \mathbf{v}, \, \dots , \, \mathbf{H}_m \mathbf{v} \right) \]
has rank at most one for all \(\mathbf {v} \in {\mathbb {R}}^{n}\), then there exist \(\alpha _1,\dots , \alpha _m \in \mathbb {R}\) and a symmetric matrix \(\mathbf {H} \in {\mathbb {R}}^{n \times n} \setminus {\left\{ 0 \right\} }\) such that \(\mathbf {H}_j = \alpha _j \mathbf {H}\) for \(j = 1, \dots , m\).
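The converse direction of Lemma A.1 is immediate and easy to check numerically. The following sketch (our illustration, not part of the paper; it assumes only NumPy) builds matrices \(\mathbf {H}_j = \alpha _j \mathbf {H}\) and verifies that \(\mathbf {A}_{\mathbf {v}}\) has rank at most one for randomly sampled \(\mathbf {v}\):

```python
# Illustration only: with H_j = alpha_j * H, every column of
# A_v = (H_1 v, ..., H_m v) is a multiple of H v, so rank(A_v) <= 1.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

H = rng.standard_normal((n, n))
H = (H + H.T) / 2                      # symmetric H != 0
alphas = rng.standard_normal(m)
Hs = [a * H for a in alphas]           # H_j = alpha_j * H

for _ in range(100):
    v = rng.standard_normal(n)
    A_v = np.column_stack([Hj @ v for Hj in Hs])
    assert np.linalg.matrix_rank(A_v, tol=1e-10) <= 1
print("rank(A_v) <= 1 for all sampled v")
```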
Lemma A.2
(Directional derivatives of rank one) Let \(\mathcal {A}\) be a neighborhood of \(0 \in {\mathbb {R}}^{n}\), let \(h: \mathcal {A} \rightarrow {\mathbb {R}}^{n \times m}\) be a function of class \(C^1\), and for \(1 \le \ell \le m\) let \({h_\ell }\!\left( \mathbf {x}\right) \) be the \(\ell \)th column of \({h}\!\left( \mathbf {x}\right) \). If \({h}\!\left( 0\right) = 0\) and \({\mathrm {rank}}\!\left( {h}\!\left( \mathbf {x}\right) \right) \le 1\) for all \(\mathbf {x} \in \mathcal {A}\), then for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix
\[ \mathbf{A}_{\mathbf{v}} := \left( D{h_1}\!\left( 0\right) \mathbf{v}, \, \dots , \, D{h_m}\!\left( 0\right) \mathbf{v} \right) \]
has rank at most one.
The proof of Lemma 5.2 uses the following auxiliary results:
Theorem A.1
(Dines’ Theorem) If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) either \(\mathbf {x}^{\mathrm {T}{}} \mathbf {A} \mathbf {x} \ne 0\) or \(\mathbf {x}^{\mathrm {T}{}} \mathbf {B} \mathbf {x} \ne 0\), then the set
\[ {\mathcal{R}}\!\left( \mathbf{A},\mathbf{B}\right) := {\left\{ \left( \mathbf{x}^{\mathrm{T}} \mathbf{A} \mathbf{x}, \, \mathbf{x}^{\mathrm{T}} \mathbf{B} \mathbf{x} \right) : \mathbf{x} \in {\mathbb{R}}^{n} \right\} } \]
is either \({\mathbb {R}}^{2}\) itself or a closed angular sector of angle less than \(\pi \).
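The dichotomy in Dines’ Theorem can be observed numerically. The sketch below (our illustration, assuming NumPy; the random matrices are arbitrary examples and are not guaranteed to satisfy the theorem’s hypothesis) samples \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) over the unit sphere and measures the largest angular gap left uncovered; a gap larger than \(\pi \) indicates that the sampled points lie in a sector of angle less than \(\pi \):

```python
# Illustration only: sample (x^T A x, x^T B x) and inspect its angular extent.
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

xs = rng.standard_normal((20000, n))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
u = np.einsum('ij,jk,ik->i', xs, A, xs)        # x^T A x
v = np.einsum('ij,jk,ik->i', xs, B, xs)        # x^T B x

angles = np.sort(np.arctan2(v, u))
gaps = np.diff(np.concatenate([angles, [angles[0] + 2 * np.pi]]))
# A largest gap above pi means the sampled image stays inside an angular
# sector of angle below pi; a tiny largest gap suggests the image is R^2.
print("largest angular gap (radians):", gaps.max())
```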
Lemma A.3
(Definite separation) Let \(\mathcal {I} \subset \mathbb {R}{}\) be a compact interval. If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) there exists \(\gamma _x \in \mathcal {I}\) such that \(\mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma _x \mathbf {B} \right) \mathbf {x} > 0\), then there exists \(\gamma ^* \in \mathcal {I}\) such that
\[ \mathbf{x}^{\mathrm{T}} \left( \mathbf{A} + \gamma^* \mathbf{B} \right) \mathbf{x} > 0 \]
for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\).
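In other words, pointwise positivity with an \(\mathbf {x}\)-dependent \(\gamma _x \in \mathcal {I}\) upgrades to positive definiteness of \(\mathbf {A} + \gamma ^* \mathbf {B}\) for a single \(\gamma ^* \in \mathcal {I}\). A minimal numerical sketch of this conclusion (our illustration, assuming NumPy; the matrices and interval are arbitrary examples) scans \(\mathcal {I}\) on a grid and reports the \(\gamma \) maximizing the smallest eigenvalue of \(\mathbf {A} + \gamma \mathbf {B}\):

```python
# Illustration only: search I = [a, b] for gamma with A + gamma*B positive
# definite, i.e., with smallest eigenvalue > 0.
import numpy as np

def best_gamma(A, B, a, b, grid=2001):
    gammas = np.linspace(a, b, grid)
    min_eigs = [np.linalg.eigvalsh(A + g * B)[0] for g in gammas]
    i = int(np.argmax(min_eigs))
    return gammas[i], min_eigs[i]

# Example data: A is indefinite, but A + gamma*B is definite for gamma > 1/2.
A = np.diag([1.0, -1.0])
B = np.diag([0.0, 2.0])
gamma_star, lam = best_gamma(A, B, 0.0, 1.0)
print(gamma_star, lam)   # lam > 0 certifies x^T (A + gamma* B) x > 0 for x != 0
```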
Dines’ Theorem is proved in [12], and in the rest of this appendix we prove our lemmas in the order in which they were stated.
Proof of Lemma 5.1
Applying Lemma A.2 to the function \({h}\!\left( \mathbf {x}\right) := {Dc}\!\left( \mathbf {x}\right) \), we conclude that for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix
\[ \left( D^2{c_1}\!\left( 0\right) \mathbf{v}, \, \dots , \, D^2{c_m}\!\left( 0\right) \mathbf{v} \right) \]
has rank at most one, and Lemma A.1 yields the coefficients \(\alpha _j\) and the matrix \(\mathbf {H}\). \(\square \)
Proof of Lemma 5.2
For every \(k \in \mathbb {N}{}\), Lemma A.3 yields \(\gamma _k \in \mathcal {I}\) such that
\[ \mathbf{x}^{\mathrm{T}} \left( \mathbf{A}_k + \gamma_k \mathbf{B}_k \right) \mathbf{x} \ge 0 \]
for all \(\mathbf {x} \in {\mathbb {R}}^{n}\). Since the sequence \(\gamma _k\) is bounded, it has a subsequence which converges to some \(\gamma ^* \in \mathcal {I}\), and passing to the limit in the inequality above along this subsequence shows that \(\gamma ^*\) is as required by Lemma 5.2. \(\square \)
Proof of Lemma A.1
The Lemma holds when \(n = 1\) or \(m = 1\). Let us then assume that it holds for \(n - 1 \ge 1\) or for \(m - 1 \ge 1\) and show that it also holds for \(n\) and \(m\). If some \(\mathbf {H}_j\) is zero, then we can take \(\alpha _j = 0\) and use induction for the remaining matrices
\[ \mathbf{H}_1, \, \dots , \, \mathbf{H}_{j-1}, \, \mathbf{H}_{j+1}, \, \dots , \, \mathbf{H}_m. \]
Therefore, we can assume that \(\mathbf {H}_j \ne 0\) for all \(j\). It follows that \(\mathbf {H}_1\) has an eigenvalue decomposition \(\mathbf {H}_1 = \mathbf {Q} \mathbf {D} \mathbf {Q}^{\mathrm {T}{}}\) with \(d_{11} \ne 0\). By replacing \(\mathbf {H}_j\) by \(\mathbf {Q}^{\mathrm {T}{}} \mathbf {H}_j \mathbf {Q}\) for all \(j\), we can assume that \(\mathbf {Q} = \mathbf {I}_{n \times n}\). Taking \(\mathbf {v} = \left( 1/d_{11} \right) \mathbf {e}_1\), we obtain that \(\mathbf {H}_1 \mathbf {v} = \mathbf {e}_1\), and the hypothesis that the matrix \(\mathbf {A}_{\mathbf {v}}\) has rank at most one implies that each column \(\mathbf {H}_j \mathbf {v}\) is a multiple of \(\mathbf {e}_1\). Since the matrices \(\mathbf {H}_j\) are symmetric, all of them have the form
\[ \mathbf{H}_j = \begin{pmatrix} \alpha_j & 0 \\ 0 & \mathbf{H}_j' \end{pmatrix} \]
for \(\alpha_j \in \mathbb{R}\) and symmetric matrices \(\mathbf {H}_j' \in {\mathbb {R}}^{\left( n-1 \right) \times \left( n-1 \right) }\) which satisfy the hypothesis of Lemma A.1 with \(n - 1\) in place of \(n\). Therefore, there exist a matrix \(\mathbf {H}' \ne 0\) and \(\alpha _1',\dots , \alpha _m'\) such that \(\mathbf {H}_j' = \alpha _j' \mathbf {H}'\) for \(j = 1,\dots ,m\). Since \(\mathbf {H}_1 \mathbf {e}_1 = d_{11} \mathbf {e}_1 = \alpha _1 \mathbf {e}_1 \ne 0\), we have that \(\alpha _1 \ne 0\). Let then \(\mathbf {v}' \in {\mathbb {R}}^{n -1}\) be such that \(\mathbf {H}' \mathbf {v}' \ne 0\), and write
\[ \mathbf{v} = \begin{pmatrix} 1 \\ \mathbf{v}' \end{pmatrix}. \]
We have that
\[ \mathbf{H}_1 \mathbf{v} = \begin{pmatrix} \alpha_1 \\ \alpha_1' \mathbf{H}' \mathbf{v}' \end{pmatrix} \quad \text{and} \quad \mathbf{H}_j \mathbf{v} = \begin{pmatrix} \alpha_j \\ \alpha_j' \mathbf{H}' \mathbf{v}' \end{pmatrix}, \]
and the vectors \(\mathbf {H}_1 \mathbf {v}\) and \(\mathbf {H}_j \mathbf {v}\) are aligned by hypothesis. Moreover, either \(\alpha _j \ne 0\) or \(\alpha _j' \ne 0\), because \(\mathbf {H}_j \ne 0\). It follows that \(\alpha _j' = \alpha _1' \alpha _j / \alpha _1\) for all \(j\). As a result, \(\mathbf {H}_j = \alpha _j \mathbf {H}\), where
\[ \mathbf{H} = \begin{pmatrix} 1 & 0 \\ 0 & \left( \alpha_1' / \alpha_1 \right) \mathbf{H}' \end{pmatrix} \ne 0, \]
and we are done. \(\square \)
Proof of Lemma A.2
For every \(\mathbf {v} \in {\mathbb {R}}^{n}\) and \(1 \le \ell \le m\), the facts that \(h \in C^1\) and \({h}\!\left( 0\right) = 0\) imply that
\[ \lim_{t \rightarrow 0} \frac{{h_\ell}\!\left( t \mathbf{v} \right)}{t} = D{h_\ell}\!\left( 0\right) \mathbf{v}, \quad \text{and therefore} \quad \lim_{t \rightarrow 0} \frac{{h}\!\left( t \mathbf{v} \right)}{t} = \mathbf{A}_{\mathbf{v}}. \]
Let \(\rho > 0\) be such that if \(\left\| \mathbf {B} - \mathbf {A}_{\mathbf {v}} \right\| \le \rho \) then \({\mathrm {rank}}\!\left( \mathbf {B}\right) \ge {\mathrm {rank}}\!\left( \mathbf {A_v}\right) \) (such a \(\rho \) exists because the rank cannot drop in a small enough neighborhood of \(\mathbf {A}_{\mathbf {v}}\)). Taking \(\delta > 0\) such that \(\delta \mathbf {v} \in \mathcal {A}\) and
\[ \left\| \frac{{h}\!\left( \delta \mathbf{v} \right)}{\delta} - \mathbf{A}_{\mathbf{v}} \right\| \le \rho, \]
we obtain that
\[ {\mathrm{rank}}\!\left( \mathbf{A}_{\mathbf{v}} \right) \le {\mathrm{rank}}\!\left( \frac{{h}\!\left( \delta \mathbf{v} \right)}{\delta} \right) = {\mathrm{rank}}\!\left( {h}\!\left( \delta \mathbf{v} \right) \right) \le 1. \]
Therefore, \({\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \le 1\), and we are done. \(\square \)
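The limit \({h}\!\left( t \mathbf {v}\right) /t \rightarrow \mathbf {A}_{\mathbf {v}}\) at the heart of this proof is easy to watch on a concrete rank-one family. In the sketch below (our illustration, assuming NumPy; the function \(h\) is an arbitrary example with \({h}\!\left( 0\right) = 0\) and rank at most one everywhere), the quotients converge to \(\mathbf {A}_{\mathbf {v}}\), whose rank is visibly at most one:

```python
# Illustration only: h(x) = phi(x) * u w^T has rank <= 1 for every x and
# h(0) = 0, so h(t v)/t converges to A_v = (grad phi(0) . v) * u w^T.
import numpy as np

n, m = 2, 3
u = np.array([1.0, 3.0])            # fixed column direction in R^n
w = np.array([1.0, -2.0, 0.5])      # fixed weights in R^m

def h(x):
    phi = x[1] + x[0] * x[1]        # phi(0) = 0, grad phi(0) = (0, 1)
    return phi * np.outer(u, w)     # n x m matrix, rank <= 1 everywhere

v = np.array([0.7, -1.3])
for t in [1e-1, 1e-2, 1e-3]:
    print(t, np.round(h(t * v) / t, 6))
# The quotients approach A_v = v[1] * u w^T, a matrix of rank one.
```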
Proof of Lemma A.3
Let us write \(\mathcal {I} = [a,b]\). If \(a = b\), then we can simply take \(\gamma ^* = a\) and we are done. Therefore, we can assume that \(a < b\). If \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) then \( u + \gamma v \ge 0\) for some \(\gamma \in [a,b]\). This implies that
\[ u \ge - \gamma v \ge - \left( \left| a \right| + \left| b \right| \right) \left| v \right|. \]
Therefore, \(\left( - 2 \left( 1 + \left| a\right| + \left| b\right| \right) , 1 \right) \not \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \), and by Dines’ Theorem \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) is a closed pointed cone. For each \(\gamma \in \mathcal {I}\), the set
\[ \mathcal{C}_\gamma := {\left\{ \left( u,v \right) \in {\mathbb{R}}^{2} : u + \gamma v \le 0 \right\} } \qquad (26) \]
is a closed convex cone, and
\[ \mathcal{C} := \bigcap_{\gamma \in \mathcal{I}} \mathcal{C}_\gamma \]
is also a closed cone. Moreover, \(\mathcal {C}\) is pointed because \(a < b\). The hypothesis tells us that for every \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ (0,0) \right\} }\) there exists \(\gamma \in \mathcal {I}\) such that \(\left( u,v \right) \not \in \mathcal {C}_\gamma \), and this implies that \(\mathcal {C} \cap {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) = {\left\{ \left( 0,0 \right) \right\} }\). Since \(\mathcal {C}\) and \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) are closed convex pointed cones in the plane which meet only at the origin, they can be strictly separated: there exists \((c,d) \in {\mathbb {R}}^{2}\) such that
\[ c u + d v > 0 \ \text{for all} \ \left( u,v \right) \in {\mathcal{R}}\!\left( \mathbf{A},\mathbf{B}\right) \setminus {\left\{ \left( 0,0 \right) \right\} } \qquad (27) \]
and
\[ c u + d v < 0 \ \text{for all} \ \left( u,v \right) \in \mathcal{C} \setminus {\left\{ \left( 0,0 \right) \right\} }. \qquad (28) \]
Equation (26) shows that \((-1,0) \in \mathcal {C}_\gamma \) for all \(\gamma \). Therefore, \((-1,0) \in \mathcal {C}\), and Eq. (28) implies that \(c > 0\); by dividing Eqs. (27) and (28) by \(c\) if necessary, we can assume that \(c = 1\). The point \(\left( a,-1 \right) \) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \ge a\). Therefore, \((a,-1) \in \mathcal {C}\), and Eq. (28) implies that \(a - d < 0\), that is, \(d > a\). Similarly, the point \((-b,1)\) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \le b\), and \(-b + d < 0\), that is, \(d < b\). In summary, we have that \(d \in \mathcal{I}\). Finally, we take \(\gamma ^* = d\), and Eq. (27) with \(c = 1\) shows that this is a valid choice. \(\square \)