Abstract
With the ever-growing amount of unstructured textual data on the web, mining these text collections is of increasing importance for understanding document archives. The self-organizing map in particular has been shown to be well suited for this task. However, interpreting the resulting document maps still requires considerable effort, especially when analyzing the features learned by the map and the characteristics of the identified text clusters. In this paper we present the LabelSOM method, which uses the features learned by the map to automatically assign a set of keywords to each unit of the map. These keywords describe the concepts of the underlying text clusters and thus make the characteristics of the various topical areas on the map explicit.
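The idea of assigning keywords to map units can be illustrated with a minimal sketch. The abstract does not state the criterion LabelSOM uses to select keywords, so the code below assumes a simple stand-in: each unit is labeled with the terms whose components in its weight vector are largest. The function name `label_units`, its parameters, and the toy vocabulary and weights are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def label_units(
    weights: Sequence[Sequence[float]],
    vocabulary: Sequence[str],
    k: int = 3,
    min_weight: float = 0.0,
) -> List[List[str]]:
    """Pick up to k keywords per map unit.

    ASSUMPTION: a term is a good label for a unit if its component in the
    unit's weight vector is large. This is an illustrative stand-in, not
    the selection rule defined in the paper.
    """
    labels = []
    for unit_weights in weights:
        if len(unit_weights) != len(vocabulary):
            raise ValueError("each weight vector needs one entry per term")
        # Rank term indices by descending weight for this unit.
        ranked = sorted(
            range(len(vocabulary)),
            key=lambda i: unit_weights[i],
            reverse=True,
        )
        # Keep the top k terms, dropping any at or below the threshold.
        labels.append(
            [vocabulary[i] for i in ranked[:k] if unit_weights[i] > min_weight]
        )
    return labels


# Toy data: two units of a trained map over a five-term vocabulary.
vocabulary = ["network", "neural", "retrieval", "library", "query"]
weights = [
    [0.90, 0.80, 0.10, 0.00, 0.05],
    [0.00, 0.10, 0.70, 0.60, 0.40],
]
unit_labels = label_units(weights, vocabulary, k=2)
print(unit_labels)
```

With these toy weights, the first unit is labeled with the two neural-network terms and the second with the two retrieval terms, which is the kind of per-unit description the abstract refers to.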
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rauber, A., Merkl, D. (1999). Mining Text Archives: Creating Readable Maps to Structure and Describe Document Collections. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_68
DOI: https://doi.org/10.1007/978-3-540-48247-5_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive