
A Simple Canonical Form for Nonlinear Programming Problems and Its Use

Journal of Optimization Theory and Applications

Abstract

We argue that reducing nonlinear programming problems to a simple canonical form is an effective way to analyze them, especially when the gradients of the constraints are linearly dependent. To illustrate this fact, we solve an open problem about constraint qualifications using this canonical form.


References

  1. Behling, R., Haeser, G., Ramos, A., Viana, D.: On a conjecture in second-order optimality conditions (2017). arXiv:1706.07833v1 [math.OC]

  2. Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer, Berlin (1983)

  3. Perko, L.: Differential Equations and Dynamical Systems. Springer, New York (1991)

  4. Mascarenhas, W.F.: The affine scaling algorithm fails for stepsize 0.999. SIAM J. Optim. 7, 34–46 (1997)

  5. Mascarenhas, W.F.: Newton’s iterates can converge to non-stationary points. Math. Program. 112, 327–334 (2008)

  6. Mascarenhas, W.F.: The BFGS algorithm with exact line searches fails for nonconvex functions. Math. Program. 99, 49–61 (2004)

  7. Mascarenhas, W.F.: On the divergence of line search methods. Comput. Appl. Math. 26, 129–169 (2007)

  8. Mascarenhas, W.F.: The divergence of the BFGS and Gauss Newton methods. Math. Program. 147, 253–276 (2014)

  9. Andreani, R., Martínez, J.M., Schuverdt, M.L.: On second-order optimality conditions for nonlinear programming. Optimization 56, 529–542 (2007)

  10. Spivak, M.: Calculus on Manifolds, A Modern Approach to Classical Theorems of Advanced Calculus. Addison-Wesley, Boston (1965)

  11. Haeser, G.: An extension of Yuan’s lemma and its applications in optimization. J. Optim. Theory Appl. 174, 641–649 (2017)

  12. Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494–498 (1941)


Correspondence to Walter F. Mascarenhas.

Appendix

In this appendix, we prove the results involving Linear Algebra used in Sect. 5. The proof of Lemma 5.1 is based on the next two lemmas:

Lemma A.1

(Rank one columns) If the symmetric matrices \(\mathbf {H}_1,\dots , \mathbf {H}_m \in {\mathbb {R}}^{n \times n}\) are such that the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( \mathbf {H}_1 \mathbf {v}, \dots , \mathbf {H}_m \mathbf {v} \right) \end{aligned}$$

has rank at most one for all \(\mathbf {v} \in {\mathbb {R}}^{n}\), then there exist \(\alpha _1,\dots ,\alpha _m \in \mathbb {R}\) and a symmetric matrix \(\mathbf {H} \in {\mathbb {R}}^{n \times n} \setminus {\left\{ 0 \right\} }\) such that \(\mathbf {H}_j = \alpha _j \mathbf {H}\) for \(j = 1,\dots ,m\).
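The converse direction of Lemma A.1 is immediate: if \(\mathbf {H}_j = \alpha _j \mathbf {H}\), then every column of \(\mathbf {A}_{\mathbf {v}}\) is a multiple of \(\mathbf {H} \mathbf {v}\). As an informal numerical sanity check (not part of the paper; it assumes NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

# Matrices of the form H_j = alpha_j * H for one random symmetric H.
M = rng.standard_normal((n, n))
H = M + M.T                          # symmetric, nonzero with probability 1
alpha = np.array([2.0, -1.0, 0.5])
Hs = [a * H for a in alpha]

# For any v, each column H_j v is a multiple of H v, so the n x m matrix
# A_v = (H_1 v, ..., H_m v) has rank at most one.
for _ in range(100):
    v = rng.standard_normal(n)
    A_v = np.column_stack([Hj @ v for Hj in Hs])
    assert np.linalg.matrix_rank(A_v) <= 1
```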

Lemma A.2

(Directional derivatives of rank one) Let \(\mathcal {A}\) be a neighborhood of \(0 \in {\mathbb {R}}^{n}\) and let \(h: \mathcal {A} \rightarrow {\mathbb {R}}^{n \times m}\) be a function of class \(C^1\), and for \(1 \le \ell \le m\) let \({h_\ell }\!\left( \mathbf {x}\right) \) be the \(\ell \)th column of \({h}\!\left( \mathbf {x}\right) \). If \({h}\!\left( 0\right) = 0\) and \({\mathrm {rank}}\!\left( {h}\!\left( \mathbf {x}\right) \right) \le 1\) for all \(\mathbf {x} \in \mathcal {A}\), then for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( {\mathrm {D}h_1}\!\left( 0\right) \mathbf {v}, \dots , {\mathrm {D}h_m}\!\left( 0\right) \mathbf {v} \right) \end{aligned}$$

has rank at most one.

The proof of Lemma 5.2 uses the following auxiliary results:

Theorem A.1

(Dines’ Theorem) If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) either \(\mathbf {x}^{\mathrm {T}{}} \mathbf {A} \mathbf {x} \ne 0\) or \(\mathbf {x}^{\mathrm {T}{}} \mathbf {B} \mathbf {x} \ne 0\), then the set

$$\begin{aligned} {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) := {\left\{ \left( \mathbf {x}^{\mathrm {T}{}} \mathbf {A} \mathbf {x}, \, \mathbf {x}^{\mathrm {T}{}} \mathbf {B} \mathbf {x} \right) : \mathbf {x} \in {\mathbb {R}}^{n} \right\} } \subset {\mathbb {R}}^{2} \end{aligned}$$
(25)

is either \({\mathbb {R}}^{2}\) itself or a closed angular sector of angle less than \(\pi \).
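As a concrete illustration of the sector case (a hypothetical example, not from the paper; it assumes NumPy), take \(\mathbf {A} = \mathrm {diag}(1,0)\) and \(\mathbf {B} = \mathrm {diag}(0,1)\): the joint range is \({\left\{ (x_1^2, x_2^2) \right\} }\), the closed first quadrant, an angular sector of angle \(\pi /2 < \pi \).

```python
import numpy as np

# Dines' hypothesis holds: for x != 0, either x1^2 != 0 or x2^2 != 0.
A = np.diag([1.0, 0.0])
B = np.diag([0.0, 1.0])

# Sample the joint range R(A, B) = {(x^T A x, x^T B x)}.
rng = np.random.default_rng(2)
pts = np.array([(x @ A @ x, x @ B @ x)
                for x in rng.standard_normal((1000, 2))])

# Every sampled point lies in the closed first quadrant.
assert (pts >= 0).all()
```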

Lemma A.3

(Definite separation) Let \(\mathcal {I} \subset \mathbb {R}{}\) be a compact interval. If the symmetric matrices \(\mathbf {A}, \mathbf {B} \in {\mathbb {R}}^{n \times n}\) are such that for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\) there exists \(\gamma _{\mathbf {x}} \in \mathcal {I}\) such that \(\mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma _{\mathbf {x}} \mathbf {B} \right) \mathbf {x} > 0\), then there exists \(\gamma ^* \in \mathcal {I}\) such that

$$\begin{aligned} \mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma ^* \mathbf {B} \right) \mathbf {x} > 0 \end{aligned}$$

for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\).
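A small numerical illustration of Lemma A.3 (the instance and the grid search are assumptions made for this sketch, not taken from the paper): with \(\mathbf {A} = \mathrm {diag}(1,-1)\), \(\mathbf {B} = \mathrm {diag}(0,2)\) and \(\mathcal {I} = [0,1]\), every \(\mathbf {x} \ne 0\) admits \(\gamma _{\mathbf {x}} = 1\), and indeed a single \(\gamma ^*\) works for all \(\mathbf {x}\), i.e., \(\mathbf {A} + \gamma ^* \mathbf {B}\) is positive definite.

```python
import numpy as np

# Toy instance: A = diag(1, -1), B = diag(0, 2), I = [0, 1].
A = np.diag([1.0, -1.0])
B = np.diag([0.0, 2.0])

# Search a grid over I for gamma making A + gamma*B positive definite
# (smallest eigenvalue > 0), as the lemma promises.
gammas = np.linspace(0.0, 1.0, 101)
good = [g for g in gammas
        if np.linalg.eigvalsh(A + g * B).min() > 0]

assert good          # some gamma* exists; here any gamma > 1/2 works
assert 1.0 in good   # gamma* = 1 is one valid choice
```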

Dines’ Theorem is proved in [12], and in the rest of this appendix we prove our lemmas in the order in which they were stated.

Proof of Lemma 5.1

Applying Lemma A.2 to the function \({h}\!\left( \mathbf {x}\right) := {Dc}\!\left( \mathbf {x}\right) \) we conclude that for every \(\mathbf {v} \in {\mathbb {R}}^{n}\) the \(n \times m\) matrix

$$\begin{aligned} \mathbf {A}_{\mathbf {v}} := \left( {\nabla ^2 \! c_1 \! \left( 0 \right) } \mathbf {v}, \dots , {\nabla ^2 \! c_m \! \left( 0 \right) } \mathbf {v} \right) \end{aligned}$$

has rank at most one, and Lemma A.1 yields the coefficients \(\alpha _j\) and the matrix \(\mathbf {H}\). \(\square \)

Proof of Lemma 5.2

For every \(k \in \mathbb {N}{}\), Lemma A.3 yields \(\gamma _k \in \mathcal {I}\) such that

$$\begin{aligned} \mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma _k \mathbf {B} + \frac{1}{k} \mathbf {I}_{n \times n} \right) \mathbf {x} > 0 \end{aligned}$$

for all \(\mathbf {x} \in {\mathbb {R}}^{n} \setminus {\left\{ 0 \right\} }\). Since the sequence \(\gamma _k\) is bounded, it has a subsequence which converges to some \(\gamma ^* \in \mathcal {I}\). Passing to the limit along this subsequence yields \(\mathbf {x}^{\mathrm {T}{}} \left( \mathbf {A} + \gamma ^* \mathbf {B} \right) \mathbf {x} \ge 0\) for all \(\mathbf {x} \in {\mathbb {R}}^{n}\), and this \(\gamma ^*\) is as required by Lemma 5.2. \(\square \)

Proof of Lemma A.1

The Lemma holds when \(n = 1\) or \(m = 1\). Let us then assume that it holds for \(m - 1 \ge 1\) matrices or in dimension \(n - 1 \ge 1\), and show that it also holds for \(m\) and \(n\). If some \(\mathbf {H}_j\) is zero, then we can take \(\alpha _j = 0\) and use induction for

$$\begin{aligned} \mathbf {H}_1,\dots ,\mathbf {H}_{j-1}, \mathbf {H}_{j+1},\dots ,\mathbf {H}_m. \end{aligned}$$

Therefore, we can assume that \(\mathbf {H}_j \ne 0\) for all j. It follows that \(\mathbf {H}_1\) has an eigenvalue decomposition \(\mathbf {H}_1 = \mathbf {Q} \mathbf {D} \mathbf {Q}^{\mathrm {T}{}}\) with \(d_{11} \ne 0\). By replacing \(\mathbf {H}_j\) by \(\mathbf {Q}^{\mathrm {T}{}} \mathbf {H}_j \mathbf {Q}\) for all j, we can assume that \(\mathbf {Q} = \mathbf {I}_{n \times n}\). Taking \(\mathbf {v} = (1/d_{11}) \mathbf {e}_1\), we obtain that \(\mathbf {H}_1 \mathbf {v} = \mathbf {e}_1\), and the hypothesis that the matrix \(\mathbf {A}_{\mathbf {v}}\) has rank at most one implies that \(\mathbf {H}_j \mathbf {v} = \alpha _j \mathbf {e}_1\) for all j. Since the matrices \(\mathbf {H}_j\) are symmetric, all of them have the form

$$\begin{aligned} \mathbf {H}_j = \left( \begin{array}{cc} \alpha _j &{}\quad 0_{n -1}^{\mathrm {T}{}} \\ 0_{n -1} &{}\quad \mathbf {H}'_j \end{array} \right) , \end{aligned}$$

for symmetric matrices \(\mathbf {H}_j' \in {\mathbb {R}}^{\left( n-1 \right) \times \left( n-1 \right) }\) which satisfy the hypothesis of Lemma A.1 in dimension \(n - 1\). Therefore, there exist a symmetric matrix \(\mathbf {H}' \ne 0\) and \(\alpha _1',\dots ,\alpha _m'\) such that \(\mathbf {H}_j' = \alpha _j' \mathbf {H}'\) for \(j = 1,\dots ,m\). Since \(\mathbf {H}_1 \mathbf {e}_1 = d_{11} \mathbf {e}_1 = \alpha _1 \mathbf {e}_1 \ne 0\), we have that \(\alpha _1 \ne 0\). Let then \(\mathbf {v}' \in {\mathbb {R}}^{n -1}\) be such that \(\mathbf {H}' \mathbf {v}' \ne 0\), and write

$$\begin{aligned} \mathbf {v} = \left( \begin{array}{c} 1 \\ \mathbf {v}' \end{array} \right) . \end{aligned}$$

We have that

$$\begin{aligned} \mathbf {H}_1 \mathbf {v} = \alpha _1 \left( \begin{array}{c} 1 \\ \frac{\alpha _1'}{\alpha _1} \mathbf {H}' \mathbf {v}' \end{array} \right) \quad \mathrm {and} \quad \mathbf {H}_j \mathbf {v} = \left( \begin{array}{c} \alpha _j \\ \alpha _j' \mathbf {H}' \mathbf {v}' \end{array} \right) \quad \mathrm {for} \ j > 1, \end{aligned}$$

and the vectors \(\mathbf {H}_1 \mathbf {v}\) and \(\mathbf {H}_j \mathbf {v}\) are aligned by hypothesis. Moreover, either \(\alpha _j \ne 0\) or \(\alpha _j' \ne 0\), because \(\mathbf {H}_j \ne 0\). It follows that \(\alpha _j' = \alpha _1' \alpha _j / \alpha _1\) for all j. As a result, \(\mathbf {H}_j = \alpha _j \mathbf {H}\), where

$$\begin{aligned} \mathbf {H} := \left( \begin{array}{cc} 1 &{}\quad 0^{\mathrm {T}{}}_{n-1} \\ 0_{n-1} &{}\quad \frac{\alpha _1'}{\alpha _1} \mathbf {H}' \end{array} \right) \end{aligned}$$

and we are done. \(\square \)

Proof of Lemma A.2

For every \(\mathbf {v} \in {\mathbb {R}}^{n}\) and \(1 \le \ell \le m\), the facts that \(h \in C^1\) and \({h}\!\left( 0\right) = 0\) imply that

$$\begin{aligned} \lim _{\delta \rightarrow 0} \frac{1}{\delta } {h_\ell }\!\left( \delta \mathbf {v}\right) = {\mathrm {D}h_\ell }\!\left( 0\right) \mathbf {v}, \end{aligned}$$

and hence \(\lim _{\delta \rightarrow 0} \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) = \mathbf {A}_{\mathbf {v}}\).

Let \(\rho > 0\) be such that if \(\left\| \mathbf {B} - \mathbf {A}_{\mathbf {v}} \right\| \le \rho \) then \({\mathrm {rank}}\!\left( \mathbf {B}\right) \ge {\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \); such a \(\rho \) exists because the rank function is lower semicontinuous. Taking \(\delta \ne 0\) such that \(\delta \mathbf {v} \in \mathcal {A}\) and

$$\begin{aligned} \left\| \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) - \mathbf {A}_{\mathbf {v}} \right\| < \rho \end{aligned}$$

we obtain that

$$\begin{aligned} 1 \ge {\mathrm {rank}}\!\left( {h}\!\left( \delta \mathbf {v}\right) \right) = {\mathrm {rank}}\!\left( \frac{1}{\delta } {h}\!\left( \delta \mathbf {v}\right) \right) \ge {\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) . \end{aligned}$$

Therefore, \({\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \le 1\), and we are done. \(\square \)
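The choice of \(\rho \) above relies on rank being lower semicontinuous: any matrix whose distance to \(\mathbf {A}_{\mathbf {v}}\) in the spectral norm is below the smallest nonzero singular value of \(\mathbf {A}_{\mathbf {v}}\) has rank at least \({\mathrm {rank}}\!\left( \mathbf {A}_{\mathbf {v}}\right) \). A quick numerical check of this fact (an illustration in NumPy, not part of the paper):

```python
import numpy as np

# A has rank 2; its smallest nonzero singular value is 0.5.
A = np.array([[1.0, 0.0], [0.0, 0.5], [0.0, 0.0]])
rho = np.linalg.svd(A, compute_uv=False)[1]   # = 0.5

# Perturbations of spectral norm < rho cannot drop the rank below 2,
# since sigma_2(A + E) >= sigma_2(A) - ||E|| > 0 (Weyl's inequality).
rng = np.random.default_rng(1)
for _ in range(100):
    E = rng.standard_normal(A.shape)
    E *= 0.9 * rho / np.linalg.norm(E, 2)     # spectral norm 0.9 * rho
    assert np.linalg.matrix_rank(A + E) >= 2
```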

Proof of Lemma A.3

Let us write \(\mathcal {I} = [a,b]\). If \(a = b\), then we can simply take \(\gamma ^* = a\) and we are done. Therefore, we can assume that \(a < b\). If \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \), then \( u + \gamma v \ge 0\) for some \(\gamma \in [a,b]\). This implies that

$$\begin{aligned} u \ge -\left| \gamma \right| \left| v\right| \ge - \left( 1 + \left| a\right| + \left| b\right| \right) \left| v\right| \end{aligned}$$

Therefore, \(\left( - 2 \left( 1 + \left| a\right| + \left| b\right| \right) , 1 \right) \not \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \), and by Dines’ Theorem \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) is a closed angular sector of angle less than \(\pi \); in particular, it is a closed, pointed, convex cone. For each \(\gamma \in \mathcal {I}\), the set

$$\begin{aligned} \mathcal {C}_\gamma := {\left\{ \left( u, v \right) \in {\mathbb {R}}^{2} : u + \gamma v \le 0 \right\} } \end{aligned}$$
(26)

is a closed convex cone, and

$$\begin{aligned} \mathcal {C} := \bigcap _{\gamma \in \mathcal {I}} \mathcal {C}_\gamma = \bigcap _{\gamma \in [a,b]} \mathcal {C}_\gamma \end{aligned}$$

is also a closed cone. Moreover, \(\mathcal {C}\) is pointed because \(a < b\). The hypothesis tells us that for every \(\left( u,v \right) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ (0,0) \right\} }\) there exists \(\gamma \in \mathcal {I}\) such that \(\left( u,v \right) \not \in \mathcal {C}_\gamma \), and this implies that \(\mathcal {C} \cap {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) = {\left\{ \left( 0,0 \right) \right\} }\). Since \(\mathcal {C}\) and \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \) are closed, pointed, convex cones which intersect only at the origin, they can be strictly separated: there exists \((c,d) \in {\mathbb {R}}^{2}\) such that

$$\begin{aligned} c u + d v > 0 \quad \mathrm {for} \; (u,v) \in {\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ \left( 0,0 \right) \right\} } \end{aligned}$$
(27)

and

$$\begin{aligned} c u + d v < 0 \quad \mathrm {for} \; (u,v) \in \mathcal {C} \setminus {\left\{ \left( 0,0 \right) \right\} }. \end{aligned}$$
(28)

Equation (26) shows that \((-1,0) \in \mathcal {C}_\gamma \) for all \(\gamma \). Therefore, \((-1,0) \in \mathcal {C}\) and Eq. (28) implies that \(c > 0\), and by dividing Eqs. (27) and (28) by c if necessary, we can assume that \(c = 1\). The point \(\left( a,-1 \right) \) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \ge a\). Therefore, \((a,-1) \in \mathcal {C}\) and Eq. (28) implies that \(a - d < 0\), that is, \(d > a\). Similarly, the point \((-b,1)\) belongs to \(\mathcal {C}_\gamma \) for all \(\gamma \le b\), and \(-b + d < 0\), that is, \(d < b\). In summary, we have that \(d \in [a,b]\). Finally, we take \(\gamma ^* = d\), and Eq. (27) with \(c = 1\) shows that this is a valid choice. \(\square \)
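The separation argument can be checked numerically on a toy instance (a hypothetical example, not from the paper): with \(\mathbf {A} = \mathrm {diag}(1,-1)\), \(\mathbf {B} = \mathrm {diag}(0,2)\) and \(\mathcal {I} = [0,1]\), the functional \((u,v) \mapsto u + d v\) with, say, \(d = 3/4\) is strictly positive on \({\mathcal {R}}\!\left( \mathbf {A},\mathbf {B}\right) \setminus {\left\{ (0,0) \right\} }\), and \(\gamma ^* = d\) indeed makes \(\mathbf {A} + \gamma ^* \mathbf {B}\) positive definite.

```python
import numpy as np

# Toy instance: R(A, B) = {(x1^2 - x2^2, 2*x2^2)}, I = [0, 1].
A = np.diag([1.0, -1.0])
B = np.diag([0.0, 2.0])
d = 0.75  # separating slope; here u + d*v = x1^2 + 0.5*x2^2

rng = np.random.default_rng(3)
for _ in range(1000):
    x = rng.standard_normal(2)
    u, v = x @ A @ x, x @ B @ x
    assert u + d * v > 0        # strictly positive on R(A, B) \ {0}

# gamma* = d makes A + gamma* B = diag(1, 0.5) positive definite.
assert np.linalg.eigvalsh(A + d * B).min() > 0
```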


Mascarenhas, W.F. A Simple Canonical Form for Nonlinear Programming Problems and Its Use. J Optim Theory Appl 181, 456–469 (2019). https://doi.org/10.1007/s10957-018-1381-7
