This paper proposes a speaker recognition technique that uses multiple model structures within a Bayesian framework. Many sophisticated statistical models have recently been proposed for speaker recognition, e.g., Joint Factor Analysis and i-vector based methods. However, most of them are built on Gaussian Mixture Models (GMMs), so improving the estimation accuracy of the underlying generative models, i.e., GMMs, with a limited amount of training data remains an important problem in speaker recognition. For this purpose, a Bayesian approach that marginalizes over all possible model parameters has been applied to GMM-based speaker recognition. This paper extends that approach to marginalization over model structures. The proposed method improves estimation accuracy by integrating multiple GMMs with different numbers of mixture components within the Bayesian framework. Experimental results show that the proposed method improves identification rates over the conventional method, which uses a single model structure.
Index Terms: speaker recognition, GMM, Bayesian approach, model structure
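As a rough illustration of the structure marginalization summarized above, the predictive distribution can be sketched as a sum over candidate model structures, each of which already marginalizes its own parameters. The notation is assumed here for illustration and is not taken from the paper: $\mathbf{x}$ is a test observation, $\mathcal{D}$ the training data of one speaker, $m \in \mathcal{M}$ a model structure (a GMM with a particular number of mixture components), and $\boldsymbol{\theta}$ the GMM parameters under structure $m$.

```latex
\[
  p(\mathbf{x} \mid \mathcal{D})
  = \sum_{m \in \mathcal{M}} P(m \mid \mathcal{D})
    \int p(\mathbf{x} \mid \boldsymbol{\theta}, m)\,
         p(\boldsymbol{\theta} \mid \mathcal{D}, m)\,
         d\boldsymbol{\theta}
\]
```

Under this reading, the conventional single-structure method corresponds to the special case in which $\mathcal{M}$ contains only one model structure, so that the outer sum reduces to a single term.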