Abstract
With more and more hypertext documents available online, hypertext classification has become a popular research topic in information retrieval. Hyperlinks, HTML tags, and category labels distributed over linked documents provide rich classification information. This paper integrates this information with content TF-IDF weights into a single document feature vector and proposes a new weighted hyper-sphere support vector machine for hypertext classification. The new method uses weight factors to eliminate the influence of uneven class sizes, and it solves multi-class classification with lower computational complexity than binary support vector machines. Experiments on a benchmark data set verify the efficiency and feasibility of the method.
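The abstract does not give the formulation, so the following is only a rough illustration of the two ingredients it names: TF-IDF content features and a one-sphere-per-class decision rule. It is a sketch under stated assumptions, not the authors' method. Each sphere here is approximated by the class mean and the largest member distance rather than by solving the SVM optimization, the class-size weight factors are omitted, and the hyperlink, HTML tag, and category label features are left out. All function names (`tfidf_vectors`, `fit_spheres`, `classify`) and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document.

    Uses one common TF-IDF variant: raw term count * log(N / document frequency).
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def distance(u, v):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys))


def fit_spheres(vectors, labels):
    """Approximate one sphere per class (assumption: no optimization is solved).

    center = mean of the class members, radius = largest member-to-center distance.
    """
    spheres = {}
    for label in set(labels):
        members = [v for v, y in zip(vectors, labels) if y == label]
        center = Counter()
        for v in members:
            for term, weight in v.items():
                center[term] += weight / len(members)
        radius = max(distance(v, center) for v in members)
        spheres[label] = (dict(center), radius)
    return spheres


def classify(vector, spheres):
    """Pick the class whose sphere surface is nearest (negative means inside)."""
    return min(spheres, key=lambda c: distance(vector, spheres[c][0]) - spheres[c][1])


# Toy usage with made-up page tokens and labels.
docs = [
    ["course", "homework", "exam"],
    ["course", "lecture", "exam"],
    ["faculty", "research", "paper"],
    ["faculty", "research", "grant"],
]
labels = ["course", "course", "faculty", "faculty"]
vectors = tfidf_vectors(docs)
spheres = fit_spheres(vectors, labels)
print(classify(vectors[0], spheres))
```

One model per class is fitted once and a test page is compared against each sphere, which is the intuition behind the abstract's claim of lower cost than combining many binary classifiers; the actual savings depend on the formulation in the paper.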
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, S., Shi, G. (2008). Weighted Hyper-sphere SVM for Hypertext Classification. In: Sun, F., Zhang, J., Tan, Y., Cao, J., Yu, W. (eds) Advances in Neural Networks - ISNN 2008. ISNN 2008. Lecture Notes in Computer Science, vol 5263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87732-5_82
DOI: https://doi.org/10.1007/978-3-540-87732-5_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87731-8
Online ISBN: 978-3-540-87732-5
eBook Packages: Computer Science (R0)