
A Simple Canonical Form for Nonlinear Programming Problems and Its Use

Journal of Optimization Theory and Applications

Abstract

We argue that reducing nonlinear programming problems to a simple canonical form is an effective way to analyze them, especially when the gradients of the constraints are linearly dependent. To illustrate this fact, we solve an open problem about constraint qualifications using this canonical form.


References

  1. Behling, R., Haeser, G., Ramos, A., Viana, D.: On a conjecture in second-order optimality conditions (2017). arXiv:1706.07833v1 [math.OC]

  2. Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer, Berlin (1983)

  3. Perko, L.: Differential Equations and Dynamical Systems. Springer, New York (1991)

  4. Mascarenhas, W.F.: The affine scaling algorithm fails for stepsize 0.999. SIAM J. Optim. 7, 34–46 (1997)

  5. Mascarenhas, W.F.: Newton’s iterates can converge to non-stationary points. Math. Program. 112, 327–334 (2008)

  6. Mascarenhas, W.F.: The BFGS algorithm with exact line searches fails for nonconvex functions. Math. Program. 99, 49–61 (2004)

  7. Mascarenhas, W.F.: On the divergence of line search methods. Comput. Appl. Math. 26, 129–169 (2007)

  8. Mascarenhas, W.F.: The divergence of the BFGS and Gauss Newton methods. Math. Program. 147, 253–276 (2014)

  9. Andreani, R., Martínez, J.M., Schuverdt, M.L.: On second-order optimality conditions for nonlinear programming. Optimization 56, 529–542 (2007)

  10. Spivak, M.: Calculus on Manifolds, A Modern Approach to Classical Theorems of Advanced Calculus. Addison-Wesley, Boston (1965)

  11. Haeser, G.: An extension of Yuan’s lemma and its applications in optimization. J. Optim. Theory Appl. 174, 641–649 (2017)

  12. Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494–498 (1941)


Correspondence to Walter F. Mascarenhas.

Appendix

In this appendix, we prove the results involving Linear Algebra used in Sect. 5. The proof of Lemma 5.1 is based on the next two lemmas:

Lemma A.1

(Rank one columns) If the symmetric matrices \(\mathbf {H}_1,\dots , \mathbf {H}_m \in {\mathbb {R}}^{n \times n}\) are such that the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( \mathbf {H}_1 \mathbf {v}, \dots , \mathbf {H}_m \mathbf {v} \right) \end{aligned}$$

has rank at most one for all \(\mathbf {v} \in {\mathbb {R}}^{n}\), then there exist \(\alpha _1,\dots ,\alpha _m \in \mathbb {R}\) and a symmetric matrix \(\mathbf {H} \in {\mathbb {R}}^{n \times n} \setminus {\left\{ 0 \right\} }\) such that \(\mathbf {H}_j = \alpha _j \mathbf {H}\) for \(j = 1,\dots ,m\).
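The converse direction of Lemma A.1 is immediate: if \(\mathbf {H}_j = \alpha _j \mathbf {H}\), then every column of \(\mathbf {A}_{\mathbf {v}}\) is a multiple of \(\mathbf {H} \mathbf {v}\). As an informal numerical sanity check (not part of the paper; it assumes NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

# Matrices of the form H_j = alpha_j * H for one random symmetric H.
M = rng.standard_normal((n, n))
H = M + M.T                          # symmetric, nonzero with probability 1
alpha = np.array([2.0, -1.0, 0.5])
Hs = [a * H for a in alpha]

# For any v, each column H_j v is a multiple of H v, so the n x m matrix
# A_v = (H_1 v, ..., H_m v) has rank at most one.
for _ in range(100):
    v = rng.standard_normal(n)
    A_v = np.column_stack([Hj @ v for Hj in Hs])
    assert np.linalg.matrix_rank(A_v) <= 1
```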

Lemma A.2

(Directional derivatives of rank one) Let \(\mathcal {A}\) be a neighborhood of \(0 \in {\mathbb {R}}^{n}\) and let \(h: \mathcal {A} \rightarrow {\mathbb {R}}^{n \times m}\) be a function of class \(C^1\), and for \(1 \le \ell \le m\) let \({h_\ell }\!\left( \mathbf {x}\right) \) be the \(\ell \)th column of \({h}\!\left( \mathbf {x}\right) \). If \({h}\!\left( 0\right) = 0\) and \({\mathrm {rank}}\!\left( {h}\!\left( \mathbf {x}\right) \right) \le 1\) for all \(\mathbf {x} \in \mathcal {A}\), then for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( {\mathrm {D}h_1}\!\left( 0\right) \mathbf {v}, \dots , {\mathrm {D}h_m}\!\left( 0\right) \mathbf {v} \right) \end{aligned}$$

has rank at most one.

The proof of Lemma 5.2 uses the following auxiliary results:

Theorem A.1

(Dines’ Theorem) If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) either \(\mathbf {x}^{\mathrm {T}{}} \mathbf {A} \mathbf {x} \ne 0\) or \(\mathbf {x}^{\mathrm {T}{}} \mathbf {B} \mathbf {x} \ne 0\), then the set

$$\begin{aligned} {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) := {\left\{ \left( \mathbf {x}^{\mathrm {T}{}} \mathbf {A} \mathbf {x}, \, \mathbf {x}^{\mathrm {T}{}} \mathbf {B} \mathbf {x} \right) : \mathbf {x} \in {\mathbb {R}}^{n} \right\} } \subset {\mathbb {R}}^{2} \end{aligned}$$
(25)

is either \({\mathbb {R}}^{2}\) itself or a closed angular sector of angle less than \(\pi \).
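As a concrete illustration of the sector case (a hypothetical example, not from the paper; it assumes NumPy), take \(\mathbf {A} = \mathrm {diag}(1,0)\) and \(\mathbf {B} = \mathrm {diag}(0,1)\): the joint range is \({\left\{ (x_1^2, x_2^2) \right\} }\), the closed first quadrant, an angular sector of angle \(\pi /2 < \pi \).

```python
import numpy as np

# Dines' hypothesis holds: for x != 0, either x1^2 != 0 or x2^2 != 0.
A = np.diag([1.0, 0.0])
B = np.diag([0.0, 1.0])

# Sample the joint range R(A, B) = {(x^T A x, x^T B x)}.
rng = np.random.default_rng(2)
pts = np.array([(x @ A @ x, x @ B @ x)
                for x in rng.standard_normal((1000, 2))])

# Every sampled point lies in the closed first quadrant.
assert (pts >= 0).all()
```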

Lemma A.3

(Definite separation) Let \(\mathcal {I} \subset \mathbb {R}{}\) be a compact interval. If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) there exists \(\gamma _{\mathbf {x}} \in \mathcal {I}\) such that \(\mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma _{\mathbf {x}} \mathbf {B} \right) \mathbf {x} > 0\), then there exists \(\gamma ^* \in \mathcal {I}\) such that

$$\begin{aligned} \mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma ^* \mathbf {B} \right) \mathbf {x} > 0 \end{aligned}$$

for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\).
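A small numerical illustration of Lemma A.3 (the instance and the grid search are assumptions made for this sketch, not taken from the paper): with \(\mathbf {A} = \mathrm {diag}(1,-1)\), \(\mathbf {B} = \mathrm {diag}(0,2)\) and \(\mathcal {I} = [0,1]\), every \(\mathbf {x} \ne 0\) admits \(\gamma _{\mathbf {x}} = 1\), and indeed a single \(\gamma ^*\) works for all \(\mathbf {x}\), i.e., \(\mathbf {A} + \gamma ^* \mathbf {B}\) is positive definite.

```python
import numpy as np

# Toy instance: A = diag(1, -1), B = diag(0, 2), I = [0, 1].
A = np.diag([1.0, -1.0])
B = np.diag([0.0, 2.0])

# Search a grid over I for gamma making A + gamma*B positive definite
# (smallest eigenvalue > 0), as the lemma promises.
gammas = np.linspace(0.0, 1.0, 101)
good = [g for g in gammas
        if np.linalg.eigvalsh(A + g * B).min() > 0]

assert good          # some gamma* exists; here any gamma > 1/2 works
assert 1.0 in good   # gamma* = 1 is one valid choice
```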

Dines’ Theorem is proved in [12], and in the rest of this appendix we prove our lemmas in the order in which they were stated.

Proof of Lemma 5.1

Applying Lemma A.2 to the function \({h}\!\left( \mathbf {x}\right) := {Dc}\!\left( \mathbf {x}\right) \) we conclude that for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( {\nabla ^2 \! c_1 \! \left( 0 \right) } \mathbf {v}, \dots , {\nabla ^2 \! c_m \! \left( 0 \right) } \mathbf {v} \right) \end{aligned}$$

has rank at most one, and Lemma A.1 yields the coefficients \(\alpha _j\) and the matrix \(\mathbf {H}\). \(\square \)

Proof of Lemma 5.2

For every \(k \in \mathbb {N}{}\), Lemma A.3 yields \(\gamma _k \in \mathcal {I}\) such that

$$\begin{aligned} \mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma _k \mathbf {B} + \frac{1}{k} \mathbf {I}_{n \times n} \right) \mathbf {x} > 0 \end{aligned}$$

for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\). Since the sequence \(\gamma _k\) is bounded, it has a subsequence which converges to some \(\gamma ^* \in \mathcal {I}\). Passing to the limit along this subsequence yields \(\mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma ^* \mathbf {B} \right) \mathbf {x} \ge 0\) for all \(\mathbf {x} \in {\mathbb {R}}^{n}\), and this \(\gamma ^*\) is as required by Lemma 5.2. \(\square \)

Proof of Lemma A.1

The Lemma holds when \(n = 1\) or \(m = 1\). Let us then assume that it holds for \(m - 1 \ge 1\) matrices or in dimension \(n - 1 \ge 1\), and show that it also holds for \(m\) and \(n\). If some \(\mathbf {H}_j\) is zero, then we can take \(\alpha _j = 0\) and use induction for

$$\begin{aligned} \mathbf {H}_1,\dots ,\mathbf {H}_{j-1}, \mathbf {H}_{j+1},\dots ,\mathbf {H}_m. \end{aligned}$$

Therefore, we can assume that \(\mathbf {H}_j \ne 0\) for all j. It follows that \(\mathbf {H}_1\) has an eigenvalue decomposition \(\mathbf {H}_1 = \mathbf {Q} \mathbf {D} \mathbf {Q}^{\mathrm {T}{}}\) with \(d_{11} \ne 0\). By replacing \(\mathbf {H}_j\) by \(\mathbf {Q}^{\mathrm {T}{}} \mathbf {H}_j \mathbf {Q}\) for all j, we can assume that \(\mathbf {Q} = \mathbf {I}_{n \times n}\). Taking \(\mathbf {v} = (1/d_{11}) \mathbf {e}_1\), we obtain that \(\mathbf {H}_1 \mathbf {v} = \mathbf {e}_1\), and the hypothesis that the matrix \(\mathbf {A}_{\mathbf {v}}\) has rank at most one implies that \(\mathbf {H}_j \mathbf {v} = \alpha _j \mathbf {e}_1\) for all j. Since the matrices \(\mathbf {H}_j\) are symmetric, all of them have the form

$$\begin{aligned} \mathbf {H}_j = \left( \begin{array}{cc} \alpha _j &{}\quad 0_{n -1}^{\mathrm {T}{}} \\ 0_{n -1} &{}\quad \mathbf {H}'_j \end{array} \right) , \end{aligned}$$

for symmetric matrices \(\mathbf {H}_j' \in {\mathbb {R}}^{\left( n-1 \right) \times \left( n-1 \right) }\) which satisfy the hypothesis of Lemma A.1 in dimension \(n - 1\). Therefore, there exist a symmetric matrix \(\mathbf {H}' \ne 0\) and \(\alpha _1',\dots ,\alpha _m'\) such that \(\mathbf {H}_j' = \alpha _j' \mathbf {H}'\) for \(j = 1,\dots ,m\). Since \(\mathbf {H}_1 \mathbf {e}_1 = d_{11} \mathbf {e}_1 = \alpha _1 \mathbf {e}_1 \ne 0\), we have that \(\alpha _1 \ne 0\). Let then \(\mathbf {v}' \in {\mathbb {R}}^{n -1}\) be such that \(\mathbf {H}' \mathbf {v}' \ne 0\), and write

$$\begin{aligned} \mathbf {v} = \left( \begin{array}{c} 1 \\ \mathbf {v}' \end{array} \right) . \end{aligned}$$

We have that

$$\begin{aligned} \mathbf {H}_1 \mathbf {v} = \alpha _1 \left( \begin{array}{c} 1 \\ \frac{\alpha _1'}{\alpha _1} \mathbf {H}' \mathbf {v}' \end{array} \right) \quad \mathrm {and} \quad \mathbf {H}_j \mathbf {v} = \left( \begin{array}{c} \alpha _j \\ \alpha _j' \mathbf {H}' \mathbf {v}' \end{array} \right) \quad \mathrm {for} \ j > 1, \end{aligned}$$

and the vectors \(\mathbf {H}_1 \mathbf {v}\) and \(\mathbf {H}_j \mathbf {v}\) are aligned by hypothesis. Moreover, either \(\alpha _j \ne 0\) or \(\alpha _j' \ne 0\), because \(\mathbf {H}_j \ne 0\). It follows that \(\alpha _j' = \alpha _1' \alpha _j / \alpha _1\) for all j. As a result, \(\mathbf {H}_j = \alpha _j \mathbf {H}\), where

$$\begin{aligned} \mathbf {H} := \left( \begin{array}{cc} 1 &{}\quad 0^{\mathrm {T}{}}_{n-1} \\ 0_{n-1} &{}\quad \frac{\alpha _1'}{\alpha _1} \mathbf {H}' \end{array} \right) \end{aligned}$$

and we are done. \(\square \)

Proof of Lemma A.2

For every \(\mathbf {v} \in {\mathbb {R}}^{n}\) and \(1 \le \ell \le m\), the facts that \(h \in C^1\) and \({h}\!\left( 0\right) = 0\) imply that

$$\begin{aligned} \lim _{\delta \rightarrow 0} \frac{1}{\delta } {h_\ell }\!\left( \delta \mathbf {v}\right) = {\mathrm {D}h_\ell }\!\left( 0\right) \mathbf {v}, \end{aligned}$$

and hence \(\lim _{\delta \rightarrow 0} \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) = \mathbf {A}_{\mathbf {v}}\).

Let \(\rho > 0\) be such that if \(\left\| \mathbf {B} - \mathbf {A}_{\mathbf {v}} \right\| \le \rho \) then \({\mathrm {rank}}\!\left( \mathbf {B}\right) \ge {\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \); such a \(\rho \) exists because the rank function is lower semicontinuous. Taking \(\delta \ne 0\) such that \(\delta \mathbf {v} \in \mathcal {A}\) and

$$\begin{aligned} \left\| \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) - \mathbf {A}_{\mathbf {v}} \right\| < \rho \end{aligned}$$

we obtain that

$$\begin{aligned} 1 \ge {\mathrm {rank}}\!\left( {h}\!\left( \delta \mathbf {v}\right) \right) = {\mathrm {rank}}\!\left( \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) \right) \ge {\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) . \end{aligned}$$

Therefore, \({\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \le 1\), and we are done. \(\square \)
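The choice of \(\rho \) above relies on rank being lower semicontinuous: any matrix whose distance to \(\mathbf {A}_{\mathbf {v}}\) in the spectral norm is below the smallest nonzero singular value of \(\mathbf {A}_{\mathbf {v}}\) has rank at least \({\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \). A quick numerical check of this fact (an illustration in NumPy, not part of the paper):

```python
import numpy as np

# A has rank 2; its smallest nonzero singular value is 0.5.
A = np.array([[1.0, 0.0], [0.0, 0.5], [0.0, 0.0]])
rho = np.linalg.svd(A, compute_uv=False)[1]   # = 0.5

# Perturbations of spectral norm < rho cannot drop the rank below 2,
# since sigma_2(A + E) >= sigma_2(A) - ||E|| > 0 (Weyl's inequality).
rng = np.random.default_rng(1)
for _ in range(100):
    E = rng.standard_normal(A.shape)
    E *= 0.9 * rho / np.linalg.norm(E, 2)     # spectral norm 0.9 * rho
    assert np.linalg.matrix_rank(A + E) >= 2
```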

Proof of Lemma A.3

Let us write \(\mathcal {I} = [a,b]\). If \(a = b\), then we can simply take \(\gamma ^* = a\) and we are done. Therefore, we can assume that \(a < b\). If \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \), then \( u + \gamma v \ge 0\) for some \(\gamma \in [a,b]\). This implies that

$$\begin{aligned} u \ge -\left| \gamma \right| \left| v\right| \ge - \left( 1 + \left| a\right| + \left| b\right| \right) \left| v\right| \end{aligned}$$

Therefore, \(\left( - 2 \left( 1 + \left| a\right| + \left| b\right| \right) , 1 \right) \not \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \), and by Dines’ Theorem \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) is a closed angular sector of angle less than \(\pi \); in particular, it is a closed, pointed, convex cone. For each \(\gamma \in \mathcal {I}\), the set

$$\begin{aligned} \mathcal {C}_\gamma := {\left\{ \left( u, v \right) \in {\mathbb {R}}^{2} : u + \gamma v \le 0 \right\} } \end{aligned}$$
(26)

is a closed convex cone, and

$$\begin{aligned} \mathcal {C} := \bigcap _{\gamma \in \mathcal {I}} \mathcal {C}_\gamma = \bigcap _{\gamma \in [a,b]} \mathcal {C}_\gamma \end{aligned}$$

is also a closed cone. Moreover, \(\mathcal {C}\) is pointed because \(a < b\). The hypothesis tells us that for every \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ (0,0) \right\} }\) there exists \(\gamma \in \mathcal {I}\) such that \(\left( u,v \right) \not \in \mathcal {C}_\gamma \), and this implies that \(\mathcal {C} \cap {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) = {\left\{ \left( 0,0 \right) \right\} }\). Since \(\mathcal {C}\) and \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) are closed, pointed, convex cones which intersect only at the origin, they can be strictly separated: there exists \((c,d) \in {\mathbb {R}}^{2}\) such that

$$\begin{aligned} c u + d v > 0 \quad \mathrm {for} \; (u,v) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ \left( 0,0 \right) \right\} } \end{aligned}$$
(27)

and

$$\begin{aligned} c u + d v < 0 \quad \mathrm {for} \; (u,v) \in \mathcal {C} \setminus {\left\{ \left( 0,0 \right) \right\} }. \end{aligned}$$
(28)

Equation (26) shows that \((-1,0) \in \mathcal {C}_\gamma \) for all \(\gamma \). Therefore, \((-1,0) \in \mathcal {C}\) and Eq. (28) implies that \(c > 0\), and by dividing Eqs. (27) and (28) by c if necessary, we can assume that \(c = 1\). The point \(\left( a,-1 \right) \) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \ge a\). Therefore, \((a,-1) \in \mathcal {C}\) and Eq. (28) implies that \(a - d < 0\), that is, \(d > a\). Similarly, the point \((-b,1)\) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \le b\), and \(-b + d < 0\), that is, \(d < b\). In summary, we have that \(d \in [a,b]\). Finally, we take \(\gamma ^* = d\), and Eq. (27) with \(c = 1\) shows that this is a valid choice. \(\square \)
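The separation argument can be checked numerically on a toy instance (a hypothetical example, not from the paper): with \(\mathbf {A} = \mathrm {diag}(1,-1)\), \(\mathbf {B} = \mathrm {diag}(0,2)\) and \(\mathcal {I} = [0,1]\), the functional \((u,v) \mapsto u + d v\) with, say, \(d = 3/4\) is strictly positive on \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ (0,0) \right\} }\), and \(\gamma ^* = d\) indeed makes \(\mathbf {A} + \gamma ^* \mathbf {B}\) positive definite.

```python
import numpy as np

# Toy instance: R(A, B) = {(x1^2 - x2^2, 2*x2^2)}, I = [0, 1].
A = np.diag([1.0, -1.0])
B = np.diag([0.0, 2.0])
d = 0.75  # separating slope; here u + d*v = x1^2 + 0.5*x2^2

rng = np.random.default_rng(3)
for _ in range(1000):
    x = rng.standard_normal(2)
    u, v = x @ A @ x, x @ B @ x
    assert u + d * v > 0        # strictly positive on R(A, B) \ {0}

# gamma* = d makes A + gamma* B = diag(1, 0.5) positive definite.
assert np.linalg.eigvalsh(A + d * B).min() > 0
```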


Mascarenhas, W.F. A Simple Canonical Form for Nonlinear Programming Problems and Its Use. J Optim Theory Appl 181, 456–469 (2019). https://doi.org/10.1007/s10957-018-1381-7
