Abstract
Automatic text categorisation has attracted considerable interest in recent years, owing to the growing availability of documents in digital form and the pressing need to organise them. This paper focuses on developing tools that enable very fast and accurate text classifiers for large-scale databases. To this end, we first introduce the main issues in text categorisation and present possible ways of handling them. Kernel-based methods, such as Support Vector Machines (SVMs), are learning methods with strong potential for solving the tasks involved in automatic text categorisation. We report the first results obtained on the Reuters-21578 collection and identify some possible points of improvement.
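To make the idea of an SVM text classifier concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain bag-of-words representation and trains a linear SVM by stochastic subgradient descent on the L2-regularised hinge loss. The toy documents, the two topic labels, the function names, and the hyperparameter values are all hypothetical and chosen only for illustration; they are not drawn from Reuters-21578 or from the paper.

```python
from collections import Counter


def bag_of_words(text):
    """Map a document to a sparse term-frequency vector."""
    return Counter(text.lower().split())


def dot(weights, features):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(weights.get(term, 0.0) * count for term, count in features.items())


def train_linear_svm(samples, epochs=50, learning_rate=0.1, reg=0.01):
    """Minimise the L2-regularised hinge loss by stochastic subgradient descent.

    samples: list of (features, label) pairs with label in {+1, -1}.
    No bias term is used, to keep the sketch short.
    """
    weights = {}
    for _ in range(epochs):
        for features, label in samples:
            margin = label * dot(weights, features)
            # Shrink step that comes from the L2 regulariser.
            for term in weights:
                weights[term] *= 1.0 - learning_rate * reg
            # The hinge-loss subgradient is non-zero only when margin < 1.
            if margin < 1.0:
                for term, count in features.items():
                    weights[term] = weights.get(term, 0.0) + learning_rate * label * count
    return weights


def predict(weights, text):
    """Return +1 or -1 according to the sign of the decision function."""
    return 1 if dot(weights, bag_of_words(text)) >= 0 else -1


# Hypothetical toy corpus: +1 = "grain" topic, -1 = "money" topic.
TRAINING = [
    ("wheat corn harvest export", 1),
    ("grain wheat price crop", 1),
    ("bank interest rate loan", -1),
    ("currency bank rate market", -1),
]

weights = train_linear_svm([(bag_of_words(text), label) for text, label in TRAINING])
print(predict(weights, "wheat export crop"))  # prints 1
print(predict(weights, "bank loan"))          # prints -1
```

A linear kernel is used here because text represented as term vectors is high-dimensional and sparse, so the sparse dot product keeps both training and prediction cheap. A real system for a collection the size of Reuters-21578 would also need term weighting, feature selection, and one classifier per category, none of which this sketch attempts.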
Copyright information
© 2003 Springer-Verlag Wien
About this paper
Cite this paper
Silva, C., Ribeiro, B. (2003). An Inductive Inference Approach to Large Scale Text Categorisation. In: Pearson, D.W., Steele, N.C., Albrecht, R.F. (eds) Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0646-4_24
DOI: https://doi.org/10.1007/978-3-7091-0646-4_24
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-00743-3
Online ISBN: 978-3-7091-0646-4
eBook Packages: Springer Book Archive