Abstract
Topics that attract public attention can originate from current events or developments, may be influenced by past situations, and often remain of interest in the future. When the relevant information is available as text, one way to detect such topics of public importance is to examine suitable sources, e.g., press articles. Given the continual growth of information, this calls for text processing techniques supported by computer routines that examine present-day textual material, check historical publications, find newly emerging topics, and track topic trends over time. A new approach is described that combines information clustering based on the content-(dis)similarity of the underlying textual material with graph-theoretical considerations for handling the network of relationships between content-similar topics. Explanatory examples of topic detection and tracking in online news articles illustrate the usefulness of the approach in different situations.
Cite this article
Gaul, W., Vincent, D. Evaluation of the evolution of relationships between topics over time. Adv Data Anal Classif 11, 159–178 (2017). https://doi.org/10.1007/s11634-016-0241-2
DOI: https://doi.org/10.1007/s11634-016-0241-2
Keywords
- Topic relationships
- Topic trend detection
- Text processing
- Content-(dis)similarity
- Information clustering