Abstract
Lexicon-based handwritten text keyword spotting (KWS) has proven to be a very fast and accurate alternative to lexicon-free methods. However, because lexicon-based KWS methods rely on a predefined vocabulary that is fixed in the training phase, they perform poorly for any query keyword not included in it, i.e. out-of-vocabulary (OOV) keywords. This renders the KWS system useless for that type of query. In this paper, we present a new way of smoothing the scores of OOV keywords, and we compare it with previously published alternatives on different data sets.
Acknowledgments
This work was partially supported by the Spanish MEC under FPU grant FPU13/06281 and under the STraDA research project (TIN2012-37475-C02-01), by the Generalitat Valenciana under the grant Prometeo/2009/014, and through the EU 7th Framework Programme grant tranScriptorium (Ref: 600707).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Puigcerver, J., Toselli, A.H., Vidal, E. (2015). A New Smoothing Method for Lexicon-Based Handwritten Text Keyword Spotting. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_3
DOI: https://doi.org/10.1007/978-3-319-19390-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19389-2
Online ISBN: 978-3-319-19390-8
eBook Packages: Computer Science (R0)