Acoustic Modelling Using Continuous Rational Kernels | Journal of Signal Processing Systems

Abstract

Many discriminative classification algorithms are designed for tasks where samples can be represented by fixed-length vectors. However, many examples in the fields of text processing, computational biology and speech recognition are best represented as variable-length sequences of vectors. Although several dynamic kernels have been proposed for mapping sequences of discrete observations into fixed-dimensional feature-spaces, few kernels exist for sequences of continuous observations. This paper introduces continuous rational kernels, an extension of standard rational kernels, as a general framework for classifying sequences of continuous observations. In addition to allowing new task-dependent kernels to be defined, continuous rational kernels allow existing continuous dynamic kernels, such as Fisher and generative kernels, to be calculated using standard weighted finite-state transducer algorithms. Preliminary results on both a large vocabulary continuous speech recognition (LVCSR) task and the TIMIT database are presented.
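To make the idea of mapping a variable-length sequence of continuous observations into a fixed-dimensional feature space concrete, the following is a minimal sketch of a Fisher-style score space. It assumes a single diagonal-covariance Gaussian as the generative model, which is a deliberate simplification: the kernels discussed in the paper are built on richer generative models and computed with weighted finite-state transducer algorithms, neither of which is reproduced here. The function names `fisher_score` and `fisher_kernel` are hypothetical, and the kernel omits the usual Fisher information normalisation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def fisher_score(seq: Sequence[Vector], mean: Vector, var: Vector) -> List[float]:
    """Gradient of the sequence log-likelihood w.r.t. the mean and variance
    of a diagonal Gaussian (an assumed, simplified generative model).

    The result always has 2 * len(mean) elements, whatever the sequence length.
    """
    d = len(mean)
    grad_mean = [0.0] * d
    grad_var = [0.0] * d
    for obs in seq:
        for i in range(d):
            diff = obs[i] - mean[i]
            # d/d(mean_i) of log N(obs_i; mean_i, var_i)
            grad_mean[i] += diff / var[i]
            # d/d(var_i) of log N(obs_i; mean_i, var_i)
            grad_var[i] += diff * diff / (2.0 * var[i] ** 2) - 1.0 / (2.0 * var[i])
    return grad_mean + grad_var


def fisher_kernel(a: Sequence[Vector], b: Sequence[Vector],
                  mean: Vector, var: Vector) -> float:
    """Unnormalised inner product between the two score vectors."""
    score_a = fisher_score(a, mean, var)
    score_b = fisher_score(b, mean, var)
    return sum(x * y for x, y in zip(score_a, score_b))


if __name__ == "__main__":
    mean, var = [0.0, 0.0], [1.0, 1.0]
    short_seq = [[1.0, 0.0]]                            # one frame
    long_seq = [[1.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]    # three frames
    print(fisher_score(short_seq, mean, var))
    print(fisher_score(long_seq, mean, var))
    print(fisher_kernel(short_seq, long_seq, mean, var))
```

Both sequences map to vectors of the same dimension despite their different lengths, so the resulting kernel value can be passed to any classifier that expects fixed-length inputs, such as a support vector machine.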

Author information

Corresponding author

Correspondence to Martin Layton.

About this article

Cite this article

Layton, M., Gales, M. Acoustic Modelling Using Continuous Rational Kernels. J VLSI Sign Process Syst Sign Im 48, 67–82 (2007). https://doi.org/10.1007/s11265-006-0027-4

  • DOI: https://doi.org/10.1007/s11265-006-0027-4
