Abstract
In this study, we propose a novel method based on block coordinate DCA for tackling large-scale kernel SVM. The method employs a unified scheme capable of handling almost all common SVM losses. Owing to the block coordinate approach, the learning algorithm copes with the high dimensionality of the optimization variables by updating one block of coordinates at a time, sufficiently decreasing the objective value while keeping the other blocks fixed. As a result, computation time is significantly reduced and the computer's memory is prevented from overflowing, which is crucial in the context of big data. Intensive numerical experiments demonstrate the algorithm's merits in both accuracy and computational cost.
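To make the scheme concrete, the following is a minimal Python sketch of one block coordinate DCA iteration for kernel SVM, instantiated with the squared hinge loss, which admits the DC decomposition l(t) = t^2 - (t^2 - l(t)) (the subtracted term is convex because l'' <= 2). Each block update re-linearizes the concave part and solves the resulting convex quadratic exactly on that block, so only an O(block_size x n) slice of the kernel matrix is touched per update. The function name, the loss choice, the random block order, and the exact block solve are illustrative assumptions, not the paper's algorithm.

import numpy as np

def bcdca_kernel_svm(K, y, lam=1e-2, block_size=64, n_epochs=20, seed=0):
    """Approximately minimize
        F(beta) = (lam/2) beta^T K beta + (1/n) sum_i max(0, 1 - y_i (K beta)_i)^2
    by updating one block of beta at a time via a DCA linearization.
    K : (n, n) symmetric PSD kernel matrix; y : labels in {-1, +1}."""
    n = K.shape[0]
    A = 2.0                      # curvature bound: t^2 - squared_hinge(t) is convex
    rng = np.random.default_rng(seed)
    beta = np.zeros(n)
    f = np.zeros(n)              # running scores f = K @ beta
    for _ in range(n_epochs):
        order = rng.permutation(n)           # random block order (an assumption)
        for start in range(0, n, block_size):
            B = order[start:start + block_size]
            # Linearize the concave part -H at the current point:
            # s_i = h_i'(f_i) with h_i(t) = (A/2) t^2 - l_i(t).
            margin = np.maximum(0.0, 1.0 - y * f)
            s = A * f + 2.0 * y * margin     # A*f - l'(f), where l'(f) = -2 y margin
            # Convex DCA surrogate: min (lam/2) b^T K b + (A/(2n)) ||K b||^2 - (1/n) s^T K b.
            # Restricted to block B, its optimality condition is the linear system
            #   M_BB beta_B = v_B - (M_{B,:} beta - M_BB beta_B_old),
            # with M = lam K + (A/n) K^2, solved exactly below.
            K_B = K[B, :]                    # the only kernel rows the update needs
            M_BB = lam * K_B[:, B] + (A / n) * K_B @ K_B.T
            v_B = K_B @ s / n
            grad_rest = lam * f[B] + (A / n) * K_B @ f - M_BB @ beta[B]
            # Small ridge guards against a singular block when K is only PSD.
            new_B = np.linalg.solve(M_BB + 1e-10 * np.eye(len(B)), v_B - grad_rest)
            f += K[:, B] @ (new_B - beta[B])  # keep f = K @ beta up to date
            beta[B] = new_B
    return beta

For instance, given a precomputed RBF kernel matrix K and labels y in {-1, +1}, beta = bcdca_kernel_svm(K, y) yields a decision function of the representer form x -> sum_i beta_i k(x_i, x), and sign(K @ beta) gives predictions on the training set. Since the surrogate majorizes the objective and touches it at the current iterate, each exact block solve decreases the objective value, matching the behavior described in the abstract.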
Cite this paper
Pham, V.T., Luu, H.P.H., Le Thi, H.A. (2022). A Block Coordinate DCA Approach for Large-Scale Kernel SVM. In: Nguyen, N.T., Manolopoulos, Y., Chbeir, R., Kozierkiewicz, A., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2022. Lecture Notes in Computer Science, vol. 13501. Springer, Cham. https://doi.org/10.1007/978-3-031-16014-1_27