Abstract
We describe an algorithm for measuring the similarity between sentences that integrates the edit distance between trees with single-term similarity techniques, and that also allows the pattern to be defined approximately, omitting some structural details. A technique of this kind is of interest in a variety of applications, such as information extraction, information retrieval, and question answering, where error-tolerant recognition allows incomplete sentences to be included in the computation.
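To make the idea of combining tree edit distance with single-term similarity concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes ordered labelled trees, unit costs for insertion and deletion, and a relabelling cost of one minus the similarity of the two labels. The lexicon `WORD_SIM`, the helper `node`, and all function names are hypothetical and chosen only for illustration. The sketch uses a plain memoised forest recurrence rather than an efficient tree matching algorithm, and it does not model the approximate patterns (omitted structural details) mentioned above.

```python
from functools import lru_cache

# Hypothetical toy lexicon: similarity in [0, 1] for unordered word pairs.
# A real system would obtain these scores from a lexical resource.
WORD_SIM = {frozenset(("dog", "puppy")): 0.8}


def word_similarity(a, b):
    """Similarity between two labels: 1.0 if identical, else a lexicon lookup."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset((a, b)), 0.0)


def relabel_cost(a, b):
    """Assumed cost model: relabelling is cheap when the labels are similar."""
    return 1.0 - word_similarity(a, b)


def node(label, *children):
    """Build an immutable, hashable tree as (label, tuple_of_children)."""
    return (label, tuple(children))


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0.0
    if not g:
        # Delete the rightmost root of f; its children take its place.
        _, kids = f[-1]
        return forest_distance(f[:-1] + kids, g) + 1.0
    if not f:
        # Insert the rightmost root of g.
        _, kids = g[-1]
        return forest_distance(f, g[:-1] + kids) + 1.0
    (v, v_kids), (w, w_kids) = f[-1], g[-1]
    return min(
        forest_distance(f[:-1] + v_kids, g) + 1.0,   # delete v
        forest_distance(f, g[:-1] + w_kids) + 1.0,   # insert w
        forest_distance(v_kids, w_kids)              # map v to w
        + forest_distance(f[:-1], g[:-1])
        + relabel_cost(v, w),
    )


def tree_distance(t1, t2):
    """Edit distance between two trees, with similarity-weighted relabelling."""
    return forest_distance((t1,), (t2,))


# Two toy parse trees that differ only in one semantically close word.
t_dog = node("S", node("NP", node("dog")), node("VP", node("barks")))
t_puppy = node("S", node("NP", node("puppy")), node("VP", node("barks")))
print(round(tree_distance(t_dog, t_puppy), 3))
```

Under this cost model, two parse trees with identical structure that differ only in a pair of similar words end up at a small distance, whereas structural differences are penalised by one unit per inserted or deleted node.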
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ribadas, F.J., Vilares, M., Vilares, J. (2005). Semantic Similarity Between Sentences Through Approximate Tree Matching. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_78
DOI: https://doi.org/10.1007/11492542_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26154-4
Online ISBN: 978-3-540-32238-2
eBook Packages: Computer Science, Computer Science (R0)