ISCA Archive Interspeech 2010

Voice activity detection based on conditional random fields using multiple features

Akira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

This paper proposes a Voice Activity Detection (VAD) algorithm based on Conditional Random Fields (CRF) using multiple features. VAD is a technique for distinguishing between speech and non-speech in noisy environments, and it is an important component of many real-world speech applications. In the proposed method, the posterior probability of the output labels is modeled directly by a weighted sum of feature functions. Estimating appropriate weight parameters automatically selects the features that are effective for VAD, which improves detection performance. Experimental results on the CENSREC-1-C database show that the proposed CRF-based method reduces error rates.
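
To make the modeling idea concrete, the following is a minimal sketch of a generic linear-chain CRF over speech/non-speech labels, not the authors' implementation. The abstract does not give the feature set, the weights, or the model topology, so everything named here is an assumption for illustration: the label encoding, the two per-frame features, the weight values in `w_emit` and `w_trans`, and the helper names. The sketch scores a label sequence as a weighted sum of feature functions, normalizes it with the forward algorithm to obtain a posterior, and decodes the most probable sequence with Viterbi.

```python
import math

LABELS = (0, 1)  # assumed encoding: 0 = non-speech, 1 = speech


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def emit(x, y, w_emit):
    """Weighted sum of the per-frame features x for label y."""
    return sum(w * f for w, f in zip(w_emit[y], x))


def sequence_score(frames, labels, w_emit, w_trans):
    """Unnormalized log-score: weighted sum of all feature functions."""
    score = 0.0
    for t, (x, y) in enumerate(zip(frames, labels)):
        score += emit(x, y, w_emit)
        if t > 0:
            score += w_trans[labels[t - 1]][y]
    return score


def log_partition(frames, w_emit, w_trans):
    """Forward algorithm in log space: log-sum over every label sequence."""
    alpha = [emit(frames[0], y, w_emit) for y in LABELS]
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[p] + w_trans[p][y] for p in LABELS])
            + emit(x, y, w_emit)
            for y in LABELS
        ]
    return logsumexp(alpha)


def posterior(frames, labels, w_emit, w_trans):
    """P(labels | frames) under the linear-chain CRF."""
    log_p = sequence_score(frames, labels, w_emit, w_trans)
    return math.exp(log_p - log_partition(frames, w_emit, w_trans))


def viterbi(frames, w_emit, w_trans):
    """Most probable label sequence."""
    delta = [emit(frames[0], y, w_emit) for y in LABELS]
    back = []
    for x in frames[1:]:
        step, new_delta = [], []
        for y in LABELS:
            best = max(LABELS, key=lambda p: delta[p] + w_trans[p][y])
            step.append(best)
            new_delta.append(delta[best] + w_trans[best][y] + emit(x, y, w_emit))
        back.append(step)
        delta = new_delta
    y = max(LABELS, key=lambda p: delta[p])
    path = [y]
    for step in reversed(back):
        y = step[y]
        path.append(y)
    return path[::-1]


# Hypothetical data: two features per frame (values are made up).
frames = [[-2.0, 0.3], [1.5, -0.7], [2.0, 0.1], [-1.0, 0.9]]
# Hypothetical weights: the second feature has zero weight, so it is ignored.
w_emit = [[-1.0, 0.0], [1.0, 0.0]]
# Hypothetical transition weights that favor staying in the same label.
w_trans = [[0.5, -0.5], [-0.5, 0.5]]

decoded = viterbi(frames, w_emit, w_trans)
print(decoded, posterior(frames, decoded, w_emit, w_trans))
```

In this toy setting, a weight of zero removes a feature's influence entirely, which is one simple way to picture how learned weights can act as feature selection. In practice the weights would be estimated from labeled data rather than set by hand.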