Abstract
The support vector machine (SVM) is a well-known method for pattern recognition and machine learning. However, training an SVM is very costly in time and memory consumption when the data set is large. Fortunately, the SVM decision function is fully determined by a small subset of the training data, called the support vectors, so removing training samples that are not relevant to the support vectors should have no effect on building the proper decision function. In this paper, an effective hybrid method is proposed to remove from the training set the data that is irrelevant to the final decision function; the number of vectors used for SVM training thus becomes small, and the training time can be decreased greatly. Experimental results show that a significant amount of training data can be discarded by our method without compromising the generalization capability of the SVM.
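The abstract's premise can be illustrated with a minimal sketch (not the authors' hybrid method): because the decision function depends only on the support vectors, retraining on just those samples reproduces essentially the same classifier. The dataset, kernel, and parameters below are illustrative assumptions, using scikit-learn.

```python
# Sketch: an SVM retrained on only its support vectors yields (essentially)
# the same decision function as one trained on the full data set.
# Illustrative only -- not the hybrid reduction method proposed in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data standing in for a large training set.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Train on the full set with a fixed RBF kernel.
full = SVC(kernel="rbf", C=1.0, gamma=0.1).fit(X, y)

# Keep only the support vectors and retrain on that reduced set.
sv_idx = full.support_
X_red, y_red = X[sv_idx], y[sv_idx]
reduced = SVC(kernel="rbf", C=1.0, gamma=0.1).fit(X_red, y_red)

print(f"training set reduced from {len(X)} to {len(X_red)} samples")
agreement = np.mean(full.predict(X) == reduced.predict(X))
print(f"prediction agreement with the full model: {agreement:.3f}")
```

In practice the support vectors are unknown before training, which is precisely why methods such as the one proposed here try to identify and discard the irrelevant samples cheaply in advance.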
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Zeng, ZQ., Gao, J., Guo, H. (2006). A Hybrid Method for Speeding SVM Training. In: Etzion, O., Kuflik, T., Motro, A. (eds) Next Generation Information Technologies and Systems. NGITS 2006. Lecture Notes in Computer Science, vol 4032. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780991_27
DOI: https://doi.org/10.1007/11780991_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35472-7
Online ISBN: 978-3-540-35473-4
eBook Packages: Computer Science (R0)