Abstract
This paper presents PermonSVM, a new machine learning tool that is part of our PERMON toolbox. It implements linear two-class support vector machines (SVMs). PermonSVM is built on top of PermonQP, the PERMON module for quadratic programming, which in turn uses PETSc. The main advantage of PermonSVM is that it is parallel; the parallelism comes from the distribution of matrices and vectors. The MPRGP algorithm, implemented in PermonQP, solves the quadratic programming problem arising from the dual SVM formulation. The scalability of MPRGP has been demonstrated on problems of mechanics with more than a billion unknowns solved on tens of thousands of cores. Apart from the scalability of our approach, we also investigate the relations between the training rate, the hyperplane margin, the value of the dual functional, and the norm of the projected gradient.
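To make the quantities named in the abstract concrete, the following is a minimal serial sketch, not PermonSVM code. It assumes a bias-free soft-margin formulation whose dual is the box-constrained problem min ½ αᵀQα − eᵀα subject to 0 ≤ α ≤ C, and it swaps in a plain projected gradient iteration in place of MPRGP. The function names `projected_gradient` and `train_dual_svm` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def projected_gradient(alpha, g, C):
    """Projected gradient of the dual objective on the box 0 <= alpha <= C.

    Free components keep g_i; at the lower bound only negative parts count,
    at the upper bound only positive parts count.
    """
    gp = g.copy()
    at_lower = alpha <= 0.0
    at_upper = alpha >= C
    gp[at_lower] = np.minimum(g[at_lower], 0.0)
    gp[at_upper] = np.maximum(g[at_upper], 0.0)
    return gp


def train_dual_svm(X, y, C=1.0, tol=1e-6, max_iter=10000):
    """Plain projected gradient on an assumed bias-free dual SVM (not MPRGP)."""
    Z = X * y[:, None]                   # rows are y_i * x_i
    Q = Z @ Z.T                          # dual Hessian, symmetric PSD
    step = 1.0 / np.linalg.norm(Q, 2)    # 1 / largest eigenvalue of Q
    alpha = np.zeros(len(y))
    for _ in range(max_iter):
        g = Q @ alpha - 1.0              # gradient of 0.5 a'Qa - e'a
        if np.linalg.norm(projected_gradient(alpha, g, C)) <= tol:
            break
        alpha = np.clip(alpha - step * g, 0.0, C)
    # Recompute the reported quantities for the final iterate.
    g = Q @ alpha - 1.0
    w = Z.T @ alpha                      # primal normal vector
    return {
        "w": w,
        "margin": 2.0 / np.linalg.norm(w),
        "dual_value": 0.5 * alpha @ Q @ alpha - alpha.sum(),
        "pg_norm": np.linalg.norm(projected_gradient(alpha, g, C)),
    }


if __name__ == "__main__":
    # Tiny separable toy set, made up for this example.
    X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    result = train_dual_svm(X, y, C=1.0)
    print(result["margin"], result["dual_value"], result["pg_norm"])
```

Monitoring `pg_norm` alongside `margin` and `dual_value` during the iteration is one simple way to observe how a stopping tolerance on the projected gradient relates to the quality of the resulting hyperplane. A distributed implementation would replace the dense NumPy arrays with distributed matrices and vectors.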
Notes
1. The penalty C is often called a regularization parameter in ML communities.
Acknowledgments
This work was supported by the Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science (LQ1602), and from the Large Infrastructures for Research, Experimental Development and Innovations project IT4Innovations National Supercomputing Center (LM2015070); by the internal student grant competition project SGS No. SP2018/165; by projects LO1404: Sustainable development of CENET, and CZ.1.05/2.1.00/19.0389: Research Infrastructure Development of the CENET; and by the Czech Science Foundation (GACR) projects no. 15-18274S and 17-22615S. We would also like to acknowledge partners in the ExCAPE project for providing us with training datasets related to the Pfam protein database.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kružík, J., Pecha, M., Hapla, V., Horák, D., Čermák, M. (2018). Investigating Convergence of Linear SVM Implemented in PermonSVM Employing MPRGP Algorithm. In: Kozubek, T., et al. High Performance Computing in Science and Engineering. HPCSE 2017. Lecture Notes in Computer Science, vol 11087. Springer, Cham. https://doi.org/10.1007/978-3-319-97136-0_9
DOI: https://doi.org/10.1007/978-3-319-97136-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97135-3
Online ISBN: 978-3-319-97136-0
eBook Packages: Computer Science, Computer Science (R0)