Abstract
SBGA, a multi-document summarizer based on genetic-algorithm sentence extraction, treats summarization as an optimization problem: the optimal summary is chosen from the set of candidate summaries formed by combining sentences from the original articles. Because this optimization problem is NP-hard, SBGA adopts a genetic algorithm, which searches for the optimal summary globally. To improve the accuracy of term frequency, SBGA employs a novel method, TFS, which takes word sense into account when calculating term frequency. Experiments on the DUC04 data show that the strategy is effective: its ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
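The idea of searching over sentence subsets with a genetic algorithm can be illustrated with a minimal sketch. It is not the SBGA implementation: the fitness function (total frequency of the distinct terms covered, under a word budget), the tournament selection, the one-point crossover, and all names and parameters (`ga_summarize`, `max_words`, `p_mut`, and so on) are assumptions made for illustration. Plain word counts stand in for TFS, since the abstract does not say how word sense enters the term frequency.

```python
import random
from collections import Counter

def term_weights(sentences):
    """Corpus-wide word counts, a simple stand-in for term importance."""
    return Counter(w for s in sentences for w in s.lower().split())

def fitness(mask, sentences, weights, max_words):
    """Total weight of distinct terms covered; -1 if the summary is too long."""
    chosen = [s for bit, s in zip(mask, sentences) if bit]
    if sum(len(s.split()) for s in chosen) > max_words:
        return -1
    covered = {w for s in chosen for w in s.lower().split()}
    return sum(weights[w] for w in covered)

def ga_summarize(sentences, max_words=20, pop_size=30,
                 generations=60, p_mut=0.05, seed=0):
    """Pick a subset of sentences (a bit mask) with a basic genetic algorithm.

    Assumes at least two sentences, so a crossover point exists.
    """
    rng = random.Random(seed)
    n = len(sentences)
    weights = term_weights(sentences)

    def score(mask):
        return fitness(mask, sentences, weights, max_words)

    # Sparse random masks, plus the empty summary so that at least one
    # individual always respects the word budget.
    pop = [[int(rng.random() < 0.2) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [0] * n
    best = max(pop, key=score)

    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            # Tournament selection of size 2 for each parent.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [b ^ (rng.random() < p_mut) for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)

    return [s for bit, s in zip(best, sentences) if bit]
```

Each individual is a bit mask over the pooled sentences of all documents, so one chromosome encodes one candidate summary. Calling `ga_summarize(sentences, max_words=100)` returns the selected sentences in their original order; a fixed `seed` makes the run repeatable.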
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, D., He, Y., Ji, D., Yang, H. (2006). Genetic Algorithm Based Multi-document Summarization. In: Yang, Q., Webb, G. (eds) PRICAI 2006: Trends in Artificial Intelligence. PRICAI 2006. Lecture Notes in Computer Science, vol 4099. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-36668-3_149
DOI: https://doi.org/10.1007/978-3-540-36668-3_149
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36667-6
Online ISBN: 978-3-540-36668-3
eBook Packages: Computer Science (R0)