Abstract
The polyhedral projection problem arises in various applications. A crucial issue in solving its dual problem efficiently is to safely identify the zero elements, as well as the signs of the nonzero elements, of the optimal solution. In this paper, relying on the nonsmooth dual problem and active-set techniques, we first propose a Duality-Gap-Active-Set strategy (DGASS) to effectively identify the indices of the zero elements and the signs of the nonzero entries of the optimal solution. Serving as an efficient acceleration strategy, DGASS can be embedded into certain iterative methods. In particular, applying DGASS to the proximal gradient algorithm (PGA) and to the proximal semismooth Newton algorithm (PSNA) yields the methods PGA-DGASS and PSNA-DGASS, respectively. Global convergence and the local quadratic convergence rate are discussed. We report numerical results on both synthetic and real data sets that demonstrate the high efficiency of the two DGASS-accelerated methods.
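To make the dual viewpoint concrete, the following is a minimal, self-contained MATLAB illustration, not the paper's PGA-DGASS: for the inequality-form polyhedron \(\{x: Ax\le b\}\) with \(b>0\), it runs a projected gradient method on the dual variable \(\lambda \ge 0\), recovers a feasible primal point, monitors the duality gap, and reads off the zero pattern of the dual iterate (the constraints predicted inactive at the solution), which is the kind of information DGASS certifies. All dimensions, iteration counts, and tolerances are illustrative.

rng(1);
m = 60; n = 25;
A = randn(m, n);
b = rand(m, 1);                  % b > 0, so x = 0 is strictly feasible
y = randn(n, 1);                 % point to be projected onto {x : A*x <= b}

L = norm(A, 2)^2;                % Lipschitz constant of the dual gradient
lambda = zeros(m, 1);
for k = 1:2000
    x = y - A' * lambda;                        % primal point induced by lambda
    lambda = max(lambda - (b - A * x) / L, 0);  % projected gradient step on the dual
end

% recover a feasible primal point by scaling toward the interior point 0
x = y - A' * lambda;
s = A * x;
if any(s > 0)
    t = min(1, min(b(s > 0) ./ s(s > 0)));
else
    t = 1;
end
xf = t * x;                                     % feasible by construction: A*xf <= b
dualval = lambda' * (A * y - b) - 0.5 * norm(A' * lambda)^2;
gap = 0.5 * norm(xf - y)^2 - dualval;           % duality gap, nonnegative by weak duality
inactive = find(lambda < 1e-8);                 % zero dual entries = predicted inactive constraints
fprintf('gap = %.2e, inactive constraints: %d of %d\n', gap, numel(inactive), m);

In the paper's DGASS, this identification is driven by duality-gap bounds rather than by the simple thresholding used above, so the discarded indices are certified rather than merely guessed.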
Data Availability
Enquiries about data availability should be directed to the authors.
Notes
To see that (2.11) and (2.4) share the same set of solutions, let \(\Lambda \) be the solution set of (1.10). By Proposition 2.1, every \(\lambda ^*\in \Lambda \) satisfies \(\lambda ^*\in {{{\mathcal {Q}}}}\), and therefore \(\Lambda \subseteq {{{\mathcal {Q}}}}\); since each such \(\lambda ^*\) attains the minimum of \(D(\lambda )\) over \(\lambda \in {\mathbb {R}}^m\), \(\Lambda \) is also the solution set of (2.4). Analogously, by Proposition 2.2, every \(\lambda ^*\in \Lambda \) satisfies \(\lambda ^*\in {{{\mathcal {Q}}}}_{{{\mathcal {C}}}}\), yielding \(\Lambda \subseteq {{{\mathcal {Q}}}}_{{{\mathcal {C}}}}\), i.e., \(\Lambda \) is the solution set of (2.11). Thus, \(\Lambda \) is the common solution set of (1.10), (2.4), and (2.11).
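In symbols, this footnote rests on the following elementary fact, stated here for \({{{\mathcal {Q}}}}\) (the case of \({{{\mathcal {Q}}}}_{{{\mathcal {C}}}}\) is identical):
\[
\Lambda \subseteq {{{\mathcal {Q}}}}\subseteq {\mathbb {R}}^m
\quad \Longrightarrow \quad
\min _{\lambda \in {{{\mathcal {Q}}}}} D(\lambda )=\min _{\lambda \in {\mathbb {R}}^m} D(\lambda )
\quad \text {and}\quad
\mathop {\mathrm {argmin}}\limits _{\lambda \in {{{\mathcal {Q}}}}} D(\lambda )={{{\mathcal {Q}}}}\cap \Lambda =\Lambda ,
\]
since the constrained minimum can be no smaller than the unconstrained one, while any \(\lambda ^*\in \Lambda \subseteq {{{\mathcal {Q}}}}\) shows it can be no larger.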
If \({\tilde{D}}(\lambda ^1)<{\tilde{D}}(\lambda ^0)\), then from the proof of Theorem 4.1 we know that \({\tilde{D}}(\lambda ^k)\le {\tilde{D}}_R^k\) for all \(k\) and that the sequence \(\{{\tilde{D}}_R^k\}\) is monotonically decreasing; hence, for all sufficiently large \(k\), \({\tilde{D}}(\lambda ^k+d_N^k)\) does not exceed \({\tilde{D}}(\lambda ^0)\). In the trivial case \({\tilde{D}}(\lambda ^1)={\tilde{D}}(\lambda ^0)\), we obtain \(\Delta _N^0=0\) from \({\tilde{D}}(\lambda ^0)={\tilde{D}}_R^0\) and the sufficient descent condition (4.9); the positive definiteness of \(H^0\) and the definition of \(\Delta _N^0\) then give \(d_N^0=0\), so by (4.8) the initial point \(\lambda ^0\) is optimal.
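Schematically (with (4.8), (4.9), \(\Delta _N^0\), and \(H^0\) as in Sect. 4), the trivial case is the implication chain
\[
{\tilde{D}}(\lambda ^1)={\tilde{D}}(\lambda ^0)={\tilde{D}}_R^0
\;\overset{(4.9)}{\Longrightarrow }\;
\Delta _N^0=0
\;\overset{H^0\succ 0}{\Longrightarrow }\;
d_N^0=0
\;\overset{(4.8)}{\Longrightarrow }\;
\lambda ^0 \text { is optimal}.
\]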
The MATLAB functions randi and rand generate uniformly distributed pseudo-random integers and uniformly distributed pseudo-random numbers, respectively, while min(v) and max(v) return the minimum and maximum entries of v. Moreover, to ensure that \(y=0\) is a feasible solution of (1.1), the randomly generated A and y are required to satisfy \(\texttt {min(A*y)<0}\) and \(\texttt {max(A*y)>0}\).
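The following sketch mirrors this generation procedure; the dimensions, the integer range passed to randi, and the centering of rand are illustrative assumptions, not the paper's exact choices.

m = 100; n = 50;
A = randi([-10, 10], m, n);        % uniformly distributed pseudo-random integers (range assumed)
y = rand(n, 1) - 0.5;              % uniformly distributed pseudo-random numbers (centering assumed)
% resample until the stated feasibility condition holds
while ~(min(A * y) < 0 && max(A * y) > 0)
    A = randi([-10, 10], m, n);
    y = rand(n, 1) - 0.5;
end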
Acknowledgements
The authors are grateful to the associate editor and the two anonymous referees for their valuable comments and suggestions.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The work of the first author was supported in part by the National Natural Science Foundation of China NSFC-72025201.
The work of the third author was supported in part by the National Natural Science Foundation of China NSFC-12071332.
The work of the last author was supported by the National Natural Science Foundation of China NSFC-11971118.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Shen, C., Zhang, LH. et al. Proximal Gradient/Semismooth Newton Methods for Projection onto a Polyhedron via the Duality-Gap-Active-Set Strategy. J Sci Comput 97, 3 (2023). https://doi.org/10.1007/s10915-023-02302-6