Abstract
Limited-memory versions of quasi-Newton methods are efficient approaches to solving large-scale optimization problems in a Euclidean space. In particular, a quasi-Newton symmetric rank-one update used in a trust-region setting has proven to be an effective method. In this paper, a limited-memory Riemannian symmetric rank-one trust-region method with a restart strategy is proposed by combining the intrinsic representation of tangent vectors with a recently proposed efficient solver for the Euclidean limited-memory symmetric rank-one trust-region subproblem. The global convergence is established under the assumptions that the vector transport is isometric and the cost function is Lipschitz continuously differentiable. The computational and spatial complexity are analyzed, and detailed implementations are described. The performance of the proposed method is compared to limited-memory Riemannian BFGS method and other state-of-the-art methods using problems from matrix completion, independent component analysis, phase retrieval, and blind deconvolution. The proposed method is novel for problems on Riemannian and Euclidean spaces.






Similar content being viewed by others
Data Availability
The datasets generated during and analysed during the current study are available from the corresponding author on reasonable request.
Notes
A preliminary version of the paper can be found in [21]. This paper is different from [21] in many places. In [21], i) it is not shown that all the Hessian approximations are bounded from above. Therefore, the global convergence is guaranteed by the heuristic of truncating all Hessian approximations if their norms are too large, ii) the vector transport may not be isometric, iii) the complexity analysis and implementation are not derived and described, iv) the LRTRSR1 with restarting using intrinsic representation is not given, v) only preliminary experiments are performed.
In [4], a variant of the subproblem is used, i.e., the norm for defining the radius varies in every iteration.
Conventionally, a vector transport is defined to be a smooth mapping, see e.g., [2]. Here, we do not require such a property by default and the smoothness is imposed only when needed.
The pseudo-inverse is independent of the Riemannian metric. It is the conventional pseudo-inverse with respect to the Euclidean metric in \({\mathbb {R}}^n\).
Note that in this case, \(y_k^T s_k\) is always positive since \(y_k^T s_k = s_k^T A s_k > 0\).
In this case, \(y_k = {\bar{G}}_k s_k\), where \({\bar{G}}_k = \int _0^1 \nabla ^2 f(x_{k} + t(x_{k+1} - x_k) ) dt\) denotes the average Hessian. If \(x_k\) and \(x_{k+1}\) are sufficiently close to \(x_*\), then the average Hessian can be sufficiently close to \(\nabla f(x_*)\) such that \({\bar{G}}_k\) is symmetric positive definite. It follows that \(y_k^T s_k = s_k^T {\bar{G}}_k s_k >0\).
Intuitively, it is natural to find a nonnegative root of the function \(\Vert \varphi (\sigma )\Vert _2 - \varDelta _k\). However, this function may cause numerical difficulties. We refer to [31, pp. 85 and 86] for more details.
Theoretically, \(\sigma _*\) needs to be the exact root. However, such \(\sigma _*\) is impractical numerically due to numerical error and the accuracy of the Newton method. We choose this stopping criterion since it usually reaches high accuracy and is robust. In our numerical experiments, the number of iterations of the Newton’s method for finding \(\sigma _*\) is usually round 3 and the final magnitude of \(\phi \) is usually around \(10^{-16}\) or lower. Occasionally, the number of iterations reaches the maximum \(2 \ell + 10\). In these cases, the final magnitudes of \(\phi \) are usually around \(10^{-10}\).
Version of August 14 2012 is available at: http://lmafit.blogs.rice.edu
References
Absil, P.-A., Baker, C.G., Gallivan, K.A.: Trust-region methods on Riemannian manifolds. Found. Comput. Math. 7(3), 303–330 (2007)
Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization algorithms on matrix manifolds. Princeton University Press, Princeton, NJ (2008)
Boothby, W. M.: An introduction to differentiable manifolds and Riemannian geometry. Academic Press, second edition (1986)
Brust, J, Burdakov, O, Erway, J.B., Marcia, R.F., Yuan, Y.-X.: ALGORITHM XXX: SC-SR1: MATLAB software for solving shape-changing L-SR1 trust-region subproblems. arXiv:1607.03533v2 (2018)
Brust, J., Erway, J.B., Marcia, R.F.: On solving L-SR1 trust-region subproblems. Comput. Optim. Appl. 66(2), 245–266 (2017)
Burdakov, O., Gong, L., Zikrin, S., Yuan, Y.X.: On efficiently combining limited-memory and trust-region techniques. Math. Program. Comput. 9(1), 101–134 (2017)
Byrd, R.H., Nocedal, J., Schnabel, R.B.: Representations of quasi-Newton matrices and their use in limited memory methods. Math. Program. 63(1–3), 129–156 (1994)
Candés, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 64(4), 1985–2007 (2016)
Cherian, A., Sra, S.: Riemannian dictionary learning and sparse coding for positive definite matrices. IEEE Trans. Neural Netw. Learning Syst. 28(12), 2859–2871 (2017)
Conn, A.R., Gould, N.I.M., Toint, P.L.: Convergence of quasi-Newton matrices generated by the symmetric rank one update. Math. Program. 50(1–3), 177–195 (1991). https://doi.org/10.1007/BF01594934
Cunningham, J.P., Ghahramani, Z.: Linear dimensionality reduction: Survey, insights, and generalizations. J. Mach. Learn. Res. 16(89), 2859–2900 (2015)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2001)
Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1998). https://doi.org/10.1137/S0895479895290954
Erway, J.B., Gill, P.E.: A subspace minimization method for the trust-region step. SIAM J. Optim. 20(3), 1439–1461 (2010)
Erway, J.B., Gill, P.E., Griffin, J.D.: Iterative methods for finding a trust-region step. SIAM J. Optim. 20(2), 1110–1131 (2009)
Golub, G. H., Van Loan, C. F.: Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, third edition (1996)
Huang, W.: Optimization algorithms on Riemannian manifolds with applications. PhD thesis, Florida State University, Department of Mathematics (2013)
Huang, W., Absil, P.-A., Gallivan, K.A.: A Riemannian symmetric rank-one trust-region method. Math. Program. 150(2), 179–216 (2015)
Huang, W., Absil, P.-A., Gallivan, K.A.: Intrinsic representation of tangent vectors and vector transport on matrix manifolds. Numer. Math. 136(2), 523–543 (2017)
Huang, W., Absil, P.-A., Gallivan, K.A., Hand, P.: ROPTLIB: an object-oriented C++ library for optimization on Riemannian manifolds. ACM Transactions on Mathematical Software 4(44), 43:1-43:21 (2018)
Huang, W., Gallivan, K. A.: A limited-memory Riemannian symmetric rank-one trust-region method with an efficient algorithm for its subproblem. In Proceedings of the 24th Internaltional Symposium on Mathematical Theory of Networks and Systems https://www.math.fsu.edu/~whuang2/papers/ALMRTRSR1.htm, accepted (2021)
Huang, W., Gallivan, K.A., Absil, P.-A.: A Broyden Class of Quasi-Newton Methods for Riemannian Optimization. SIAM J. Optim. 25(3), 1660–1685 (2015)
Huang, W., Gallivan, K.A., Srivastava, A., Absil, P.-A.: Riemannian optimization for registration of curves in elastic shape analysis. J. Mathe. Imaging Vision 54(3), 320–343 (2015). https://doi.org/10.1007/s10851-015-0606-8
Huang, W., Absil, P.-A., Gallivan, K.A.: A Riemannian BFGS method without differentiated retraction for nonconvex optimization problems. SIAM J. Optim. 28(1), 470–495 (2018)
Huang, W., Gallivan, K.A., Zhang, X.: Solving PhaseLift by low rank Riemannian optimization methods for complex semidefinite constraints. SIAM J. Sci. Comput. 39(5), B840–B859 (2017)
Huang, W., Hand, P.: Blind deconvolution by a steepest descent algorithm on a quotient manifold. SIAM J. Imag. Sci. 11(4), 2757–2785 (2018)
Iannazzo, B., Porcelli, M.: The riemannian barzilai-borwein method with nonmonotone line search and the matrix geometric mean computation. IMA J. Numer. Anal. 38(1), 495–517 (2018)
Jeuris, B., Vandebril, R., Vandereycken, B.: A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electron. Trans. Numer. Anal. 39, 379–402 (2012)
Kasai, H., Mishra, B.: Low-rank tensor completion: a riemannian manifold preconditioning approach. volume 48 of Proceedings of Machine Learning Research, pages 1012–1021, New York, New York, USA, 20–22 Jun 2016. PMLR
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1), 503–528 (1989)
Nocedal, J., Wright, S. J.: Numerical Optimization. Springer, second edition, (2006)
Nocedal, J.: Updating quasi-Newton matrices with limited storage. Math. Comput. 35(151), 773–773 (1980)
Omar, D.G., Erway, J.B., Marcia, R.F.: Compact representation of the full broyden class of quasi-newton updates. Numerical Linear Algebra with Applications, 25, e2186 (2017)
Reed, Mb.: L-Broyden methods: a generalization of the L-BFGS method to the limited-memory Broyden family. Int. J. Comput. Math. 86(4), 606–615 (2009)
Ring, W., Wirth, B.: Optimization methods on Riemannian manifolds and their application to shape space. SIAM J. Optim. 22(2), 596–627 (2012). https://doi.org/10.1137/11082885X
Sato, H.: A Dai-Yuan-type Riemannian conjugate gradient method with the weak Wolfe conditions. Comput. Optim. Appl. 64(1), 101–118 (2016)
Sato, H., Iwai, T.: A new, globally convergent Riemannian conjugate gradient method. Optimization 64(4), 1011–1031 (2015)
Sra, S., Hosseini, R.: Conic geometric optimization on the manifold of positive definite matrices. SIAM J. Optim. 25(1), 713–739 (2015)
Sun, J., Qing, Q., Wright, J.: A Geometric Analysis of Phase Retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
Theis, F. J., Cason, T. P., Absil, P.-A.: Soft dimension reduction for ICA by joint diagonalization on the Stiefel manifold. Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation, 5441, 354–361 (2009)
Vandereycken, B.: Low-rank matrix completion by Riemannian optimization–extended version. SIAM J. Optim. 23(2), 1214–1236 (2013)
Wen, Z., Yin, W., Zhang, Y.: Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math. Program. Comput. 4(4), 333–361 (2012). https://doi.org/10.1007/s12532-012-0044-1
Yuan, X., Huang, W., Absil, P.-A., Gallivan, K. A.: Computing the matrix geometric mean: Riemannian vs Euclidean conditioning, implementation techniques, and a Riemannian BFGS method. Technical Report UCL-INMA-2019.05, U.C.Louvain (2019)
Zhou, X., Yang, C., Zhao, H., Weichuan, Y.: Low-rank modeling and its applications in image analysis. ACM Computing Surveys (CSUR) 47(2), 1–33 (2014)
Zhu, X.: A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Comput. Optim. Appl. 67(1), 73–110 (2017)
Funding
Wen Huang was partially supported by the Fundamental Research Funds for the Central Universities (NO. 20720190060) and National Natural Science Foundation of China (NO. 12001455).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A preliminary version of this paper can be found in [21]
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, W., Gallivan, K.A. A Limited-Memory Riemannian Symmetric Rank-One Trust-Region Method with a Restart Strategy. J Sci Comput 93, 1 (2022). https://doi.org/10.1007/s10915-022-01962-0
Received:
Revised:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10915-022-01962-0