ISCSLP 2014: Singapore
- Minghui Dong, Jianhua Tao, Haizhou Li, Thomas Fang Zheng, Yanfeng Lu:
The 9th International Symposium on Chinese Spoken Language Processing, Singapore, September 12-14, 2014. IEEE 2014, ISBN 978-1-4799-4220-6 - Shaofei Xue, Hui Jiang, Li-Rong Dai:
Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition. 1-5 - Xin Chen, Jian Cheng:
Deep neural network acoustic modeling for native and non-native Mandarin speech recognition. 6-9 - Xiangang Li, Xihong Wu:
Labeling unsegmented sequence data with DNN-HMM and its application for speech recognition. 10-14 - Xinhui Hu, Xugang Lu, Chiori Hori:
Mandarin speech recognition using convolution neural network with augmented tone features. 15-18 - Yuan Ma, Jianwu Dang, Weifeng Li:
Research on deep neural network's hidden layers in phoneme recognition. 19-23 - Bin Wang, Zhijian Ou, Jian Li, Akinori Kawamura:
Joint-character-POC N-gram language modeling for Chinese speech recognition. 24-28 - Gang Sun, Zhiwei Jiang, Qing Gu, Daoxu Chen:
Linear model incorporating feature ranking for Chinese documents readability. 29-33 - Jen-Tzung Chien, Yuan-Chu Ku, Mou-Yue Huang:
Rapid Bayesian learning for recurrent neural network language model. 34-38 - Zhiyang He, Ping Lv, Ji Wu:
Minimum classification error rate training of supervised topic mixture model for multi-label text categorization. 39-43 - Chongjia Ni, Cheung-Chi Leung:
Investigation of using different Chinese word segmentation standards and algorithms for automatic speech recognition. 44-48 - Xiaohao Yang, Jia Liu:
Deep belief network based CRF for spoken language understanding. 49-53 - Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Local variability vector for text-independent speaker verification. 54-58 - Ikuya Hirano, Kong-Aik Lee, Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai:
Single-sided approach to discriminative PLDA training for text-independent speaker verification without using expanded i-vector. 59-63 - Wei Rao, Man-Wai Mak:
Relevance vector machines with empirical likelihood-ratio kernels for PLDA speaker verification. 64-68 - Rong Zheng, Bo Xu:
Data-driven tree structure based UBM reconstruction for speaker verification. 69-72 - Shanshan Zhang, Rong Zheng, Bo Xu:
An iVector extractor using pre-trained neural networks for speaker verification. 73-77 - Wenbo Liu, Zhiding Yu, Ming Li:
An iterative framework for unsupervised learning in the PLDA based speaker verification. 78-82 - Changqing Kong, Shaofei Xue, Jianqing Gao, Wu Guo, Li-Rong Dai, Hui Jiang:
Speaker adaptive bottleneck features extraction for LVCSR based on discriminative learning of speaker codes. 83-87 - Yi Liu, Xiangang Li, Xihong Wu:
Error-driven pronunciation dictionary construction for Mandarin speech recognition. 88-92 - Yunxin Zhao, Tuo Zhao, Xin Chen:
Multilevel sampling and aggregation for discriminative training. 93-97 - Tuo Zhao, Yunxin Zhao, Xin Chen:
Building an ensemble of CD-DNN-HMM acoustic model using random forests of phonetic decision trees. 98-102 - Tom Ko, Brian Kan-Wing Mak, Dongpeng Chen:
Modeling inter-cluster and intra-cluster discrimination among triphones. 103-107 - Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi:
Robust voice activity detection based on concept of modulation transfer function in noisy reverberant environments. 108-112 - Mirco Ravanelli, Van Hai Do, Adam Janin:
TANDEM-bottleneck feature combination using hierarchical Deep Neural Networks. 113-117 - Xiang Sui, Huiyong Wang, Lan Wang:
A general framework for multi-accent Mandarin speech recognition using adaptive neural networks. 118-122 - Xiangang Li, Xihong Wu:
Decision tree based state tying for speech recognition using DNN derived embeddings. 123-127 - Jianwei Niu, Yanmin Qian, Kai Yu:
Acoustic emotion recognition using deep neural network. 128-132 - Meng Cai, Yongzhe Shi, Jian Kang, Jia Liu, Tengrong Su:
Convolutional maxout neural networks for low-resource speech recognition. 133-137 - Chongjia Ni, Nancy F. Chen, Bin Ma:
Multiple time-span feature fusion for deep neural network modeling. 138-142 - Bing Jiang, Yan Song, Si Wei, Meng-Ge Wang, Ian McLoughlin, Li-Rong Dai:
Performance evaluation of deep bottleneck features for spoken language identification. 143-147 - Wei-Wei Liu, Wei-Qiang Zhang, Jia Liu:
Discriminative boosting regression backend for phonotactic language recognition. 148-152 - Wei-Wei Liu, Meng Cai, Hua Yuan, Xiao-Bei Shi, Wei-Qiang Zhang, Jia Liu:
Phonotactic language recognition based on DNN-HMM acoustic model. 153-157 - Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A fusion approach to spoken language identification based on combining multiple phone recognizers and speech attribute detectors. 158-162 - Zhiyi Li, Wei-Qiang Zhang, Yao Tian, Jia Liu:
A new fast and memory effective i-vector extraction based on factor analysis of KLD derived GMM supervector. 163-167 - Yi Liu, Liang He, Jia Liu:
Improved multitaper PNCC feature for robust speaker verification. 168-172 - Cheng-Tao Chung, Hsin-Kuan Hsiung, Cheng-Kuang Wei, Lin-Shan Lee:
Personalized video summarization based on Multi-Layered Probabilistic Latent Semantic Analysis with shared topics. 173-177 - Ming-Hsiang Su, Yu-Ting Zheng, Chung-Hsien Wu:
Interlocutor personality perception based on BFI profiles and coupled HMMs in a dyadic conversation. 178-182 - Shifu Xiong, Wu Guo, Diyuan Liu:
The Vietnamese speech recognition based on rectified linear units deep neural network and spoken term detection system combination. 183-186 - Zhipeng Chen, Zhiyang He, Ping Lv, Ji Wu:
Improving keyword search by query expansion in a probabilistic framework. 187-191 - I-Fan Chen, Chongjia Ni, Boon Pang Lim, Nancy F. Chen, Chin-Hui Lee:
A novel keyword+LVCSR-filler based grammar network representation for spoken keyword search. 192-196 - Feng-Long Xie, Yao Qian, Frank K. Soong, Haifeng Li:
Pitch transformation in neural network based voice conversion. 197-200 - Yu-Sheng Sun, Zhen-Hua Ling, Xiang Yin, Li-Rong Dai:
Integrating global variance of log power spectrum derived from LSPs into MGE training for HMM-based parametric speech synthesis. 201-205 - Jingjie Li, Ian Vince McLoughlin, Yan Song:
Reconstruction of pitch for whisper-to-speech conversion of Chinese. 206-210 - Xiaohai Tian, Zhizheng Wu, Siu Wa Lee, Engsiong Chng:
Correlation-based frequency warping for voice conversion. 211-215 - Xixin Wu, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai, Weifeng Li:
Automatic speech data clustering with human perception based weighted distance. 216-220 - Xian Li, Zengfu Wang:
Frame correlation based autoregressive GMM method for voice conversion. 221-225 - Shusen Li, Zhiyang He, Ji Wu:
An ontology semantic tree based natural language interface. 226-230 - Xiaohao Yang, Zhenfeng Chen, Jia Liu:
Word embeddings: A semi-supervised learning method for slot-filling in spoken dialog systems. 231-235 - Dongxia Qian, Yuan Jia, Aijun Li, Liang Xu:
An experimental comparative study on prosodic features between Ningbo EFL learners and American Native speakers - in the case of production of yes-no questions. 236-240 - Shuju Shi, Jinsong Zhang, Yanlu Xie:
Cross-language comparison of F0 range in speakers of native Chinese, native Japanese and Chinese L2 of Japanese: Preliminary results of a corpus-based analysis. 241-244 - Wenping Hu, Yao Qian, Frank K. Soong:
A new Neural Network based logistic regression classifier for improving mispronunciation detection of L2 language learners. 245-249 - Yanhui Tu, Jun Du, Yong Xu, Li-Rong Dai, Chin-Hui Lee:
Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers. 250-254 - Kun Li, Helen M. Meng:
Mispronunciation detection and diagnosis in L2 English speech using multi-distribution Deep Neural Networks. 255-259 - Kimiko Tsukada, Hui Ling Xu, Nan Xu Rattanasone:
The perception of Mandarin tones by learners from heritage and non-heritage backgrounds. 260-264 - Liang Zhang, Yuan Jia, Aijun Li:
A preliminary research on rhetorical structural and prosodic features in Chinese reading texts. 265-269 - Jinfu Ni, Yoshinori Shiga, Chiori Hori:
Superpositional HMM-based intonation synthesis using a functional F0 model. 270-274 - Li Gao, Zhen-Hua Ling, Ling-Hui Chen, Li-Rong Dai:
Improving F0 prediction using bidirectional associative memories and syllable-level F0 features for HMM-based Mandarin speech synthesis. 275-279 - Zhengchen Zhang, Minghui Dong:
The power of special characters in prosodic word prediction for Chinese TTS. 280-283 - Hao Liu, Yi Xu:
Learning model-based F0 production through goal-directed babbling. 284-288 - Fei Chen, Kunyu Xu, Gang Peng:
Effects of preceding contexts on the categorical perception of Mandarin tones. 289-293 - Han Yan, Jianwu Dang, Mengxue Cao, Bernd J. Kröger:
A new framework of neurocomputational model for speech production. 294-298 - Dan Zhang, Xianqian Liu, Nan Yan, Lan Wang, Yun Zhu, Hui Chen:
A multi-channel/multi-speaker articulatory database in Mandarin for speech visualization. 299-303 - Shing Yu, Tan Lee, Manwa L. Ng:
Surface electromyographic activity of non-laryngeal neck muscles in Cantonese tone production. 304-307 - Tanvina B. Patel, Hemant A. Patil:
Novel approach for estimating length of the vocal folds using Fujisaki model. 308-312 - Ian Vince McLoughlin, Yan Xu, Yan Song:
Tone confusion in spoken and whispered Mandarin Chinese. 313-316 - Xugang Lu, Yu Tsao, Peng Shen, Chiori Hori:
Spectral patch based sparse coding for acoustic event detection. 317-320 - Ping Yu, Zhijian Wang, Shanshan Liu, Nan Yan, Lan Wang, Manwa L. Ng:
Multidimensional acoustic analysis for voice quality assessment based on the GRBAS scale. 321-325 - Chengran Zhang, Jianwu Dang, Jianan Zhang, Jianguo Wei:
Investigation on articulatory and acoustic characteristics of dysarthria. 326-330 - Feng Huang, Tan Lee:
Multipitch tracking based on linear programming relaxation and sparsity-based pitch candidate estimation. 331-335 - Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Cross-language transfer learning for deep neural network based speech enhancement. 336-340 - Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li:
Improving generation performance of speech emotion recognition by denoising autoencoders. 341-344 - Xin Xu, Ya Li, Xiaoying Xu, Zhengqi Wen, Hao Che, Shanfeng Liu, Jianhua Tao:
Survey on discriminative feature selection for speech emotion recognition. 345-349 - Abdusalam Dawut, Hussein Yusuf, Askar Hamdulla:
The emotion recognition from Uyghur sentences based on combination of Class Discriminating Words and sentiment dictionary. 350 - Seyyare Imam, Rayilam Parhat, Askar Hamdulla, Zhijun Li:
Performance analysis of different keyword extraction algorithms for emotion recognition from Uyghur text. 351 - Youran Lin, Jiangping Kong:
Study of pitch of "dearing" as emotional speech. 352 - Xiaoying Xu, Huimin Wang, Ya Li, Wei Lai, Jianhua Tao:
The expression of emotions by text and speech. 353 - Rui Li, Jun Yu, Chen Jiang, Changwei Luo, Zengfu Wang:
A mass-spring tongue model with efficient collision detection and response during speech. 354-358 - Gaowu Wang, Jianwu Dang, Jiangping Kong:
The modeling of tongue tip in Standard Chinese using MRI. 359 - Feng Yang, Jiangping Kong:
The chest and abdomen breathing in reading literature in Mandarin. 360 - Yinghao Li, Jiangping Kong:
Effects of focal stress on the articulatory and acoustic properties of segments in Standard Chinese. 361 - Xiaosheng Pan, Menghan Zhang, Alan Wee-Chung Liew:
Definition and extraction of lip protrusion based on the facial skeleton data. 362 - Jianan Zhang, Jianguo Wei, Chengran Zhang, Dian Huang, Jianwu Dang:
Visualization of Mandarin articulation driven by ultrasound data. 363-366 - Honglin Cao, Jiangping Kong:
Correlations between vocal tract parameters and body heights in adult humans. 367 - Qiang Fang, Jie Liu, Chan Song, Jianguo Wei, Wenhuan Lu:
A novel 3D geometric articulatory model. 368-371 - Xinyuan Zheng, Jianguo Wei, Wenhuan Lu, Qiang Fang, Jianwu Dang:
Mapping between ultrasound and vowel speech using DNN framework. 372-376 - Jianrong Wang, Ju Zhang, Jianguo Wei, Wenhuan Lu, Jianwu Dang:
Automatic speech recognition under robot ego noises. 377 - Hao Chen, Zhenye Gan, Hongwu Yang:
Realizing speech enhancement by combining EEMD and K-SVD dictionary training algorithm. 378 - Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Xiong Xiao, Engsiong Chng, Haizhou Li:
Single-channel dereverberation for distant-talking speech recognition by combining denoising autoencoder and temporal structure normalization. 379-383 - Satoshi Shiota, Longbiao Wang, Kyohei Odani, Atsuhiko Kai, Weifeng Li:
Distant-talking speech recognition using multi-channel LMS and multiple-step linear prediction. 384-388 - Liyang Liu, Zhaogui Ding, Weifeng Li, Longbiao Wang, Qingmin Liao:
Speech enhancement via low-rank matrix decomposition and image based masking. 389-392 - Zhong-Hua Fu, Lei Xie, Hang Lv:
Experimental study on dereverberation and noise reduction for distant speech recognition. 393-397 - Yu-Hao Wu, Jia Jia, Xiu-Long Zhang, Lianhong Cai:
Algorithm of pure tone audiometry based on multiple judgment. 398 - Xiao Chen, Bo Xu:
An improved pitch extraction algorithm for speech processing. 399 - Yifan Guo, Y. X. Zou, Yongqing Wang:
A robust high resolution speaker DOA estimation under reverberant environment. 400 - Shaofei Zhang, Lei Xie, Zhong-Hua Fu, Yougen Yuan:
A hybrid virtual bass system with improved phase vocoder and high efficiency. 401-405 - Yinghao Li, Jinghua Zhang:
An electropalatographic and electroglottographic study on domain-initial strengthening in Korean. 406-410 - Mijit Ablimit, Akbar Pattar, Askar Hamdulla:
Multilayer structure based lexicon optimization for agglutinative languages. 411 - Gulmire Imam, Guljamal Mamateli, Maynur Ablitip, Askar Hamdulla:
Prosody modeling for Uyghur TTS. 412 - Rong Liu, Dong Wang, Chao Xing:
Document classification based on c. 413 - Wei-Qiang Zhang, Junhong Zhao, Wen-Lin Zhang, Jia Liu:
Multi-scale kernels for short utterance speaker recognition. 414-417 - Sheng Zhang, Jie Xu, Wu Guo, Guoping Hu, Xiaokong Ma:
Speaker verification based on SVM and total variability. 418 - Yao Tian, Liang He, Zhiyi Li, Wei-lan Wu, Wei-Qiang Zhang, Jia Liu:
Speaker verification using Fisher vector. 419-422 - Jun Wang, Lantian Li, Dong Wang, Thomas Fang Zheng:
Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification. 423 - Yingchun Yang, Licai Deng:
Score regulation based on GMM Token Ratio Similarity for speaker recognition. 424 - Fanhu Bie, Dong Wang, Thomas Fang Zheng:
Research on truncated speech in speaker verification. 425 - Hankiz Yilahun, Gulmire Imam, Maynur Ablitip, Guljamal Mamateli, Askar Hamdulla:
The undulating scale of intonations of exclamatory sentences in Uyghur from the view of experimental phonetics. 426 - Jinghua Zhang, Yinghao Li:
A proficient trilingual's production of sibilant fricatives of Mandarin Chinese, Korean and English. 427-431 - Mengxue Cao, Aijun Li, Qiang Fang:
GSOM-based modeling study on phoneme acquisition. 432 - Zuyan Wang, Jinsong Zhang:
Influences of vowels on perception of nasal codas in Mandarin for Japanese learners and Chinese. 433 - Richeng Duan, Jinsong Zhang, Yanlu Xie, Wen Cao:
Automatic mispronunciation detection for Mandarin Chinese based on articulation place and articulation manner. 434 - Yanlu Xie, Bei Zhang, Jinsong Zhang:
The training of the tone of Mandarin two-syllable words based on pitch projection synthesis speech. 435 - Xuee Lin, Jian Yang, Juan Zhao:
The text analysis and processing of Thai language text to speech conversion system. 436 - Syed Shahnawazuddin, Rohit Sinha:
A low complexity cluster model interpolation based on-line adaptation technique for spoken query systems. 437-441 - Sheng Li, Yuya Akita, Tatsuya Kawahara:
Corpus and transcription system of Chinese Lecture Room. 442-445 - Zhao You, Bo Xu:
Improving training time of deep neural network with asynchronous averaged stochastic gradient descent. 446-449 - Zhao You, Bo Xu:
Investigation of stochastic Hessian-Free optimization in Deep neural networks for speech recognition. 450-453 - Syu-Siang Wang, Payton Lin, Dau-Cheng Lyu, Yu Tsao, Hsin-Te Hwang, Borching Su:
Acoustic feature conversion using a polynomial based feature transferring algorithm. 454-458 - Caixia Gong, Xiangang Li, Xihong Wu:
Recurrent neural network language model with part-of-speech for Mandarin speech recognition. 459-463 - Mohammadi Zaki, Nirmesh J. Shah, Hemant A. Patil:
Effectiveness of fractal dimension for ASR in low resource language. 464-468 - Antoine Laurent, William Hartmann, Lori Lamel:
Unsupervised acoustic model training for the Korean language. 469-473 - Hui Wang, Yue Zhao, Yanmin Xu, Xiaona Xu, Xingmei Suo, Qiang Ji:
Cross-language speech attribute detection and phone recognition for Tibetan using deep learning. 474-477 - Kunxia Wang, Ning An, Lian Li:
Speech emotion recognition based on wavelet packet coefficient model. 478-482 - Hung-Yan Gu, Sung-Fung Tsai:
Improving segmental GMM based voice conversion method with target frame selection. 483-487 - Wei Tong Mok, Rachael Sing, Xiuting Jiang, Swee Lan See:
Investigation of social media on depression. 488-491 - Wenjun Duan, Yuan Jia:
The typology of focus realization of Northern Mandarin. 492-496 - Wei Bao, Ya Li, Mingliang Gu, Jianhua Tao, Linlin Chao, Shanfeng Liu:
Combining prosodic and spectral features for Mandarin intonation recognition. 497-500 - Hao Che, Zhengqi Wen, Ya Li, Jianhua Tao:
Investigating effect of rich syntactic features on Mandarin prosodic phrase boundaries prediction. 501-505 - Shanfeng Liu, Zhengqi Wen, Ya Li, Jianhua Tao, Bin Liu:
Context features based pre-selection and weight prediction in concatenation speech synthesis system. 506-510 - Po-Chun Wang, I-Bin Liao, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen:
Speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS. 511-515 - Yang Wang, Jianhua Tao:
Evaluation of parameter generation using high order dynamic features and long span windows for HMM based speech synthesis. 516-520 - Hardik B. Sailor, Hemant A. Patil:
Fusion of magnitude and phase-based features for objective evaluation of TTS voice. 521-525 - Nirmesh J. Shah, Hemant A. Patil, Maulik C. Madhavi, Hardik B. Sailor, Tanvina B. Patel:
Deterministic annealing EM algorithm for developing TTS system in Gujarati. 526-530 - Bin Liu, Jianhua Tao, Fuyuan Mo, Ya Li, Zhengqi Wen, Shanfeng Liu:
Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability. 531-535 - Honglin Cao, Yingli Wang, Jiangping Kong:
Correlations between body heights and formant frequencies in young male speakers: a pilot study. 536-540 - Anshu Chittora, Hemant A. Patil:
Classification of pathological infant cries using modulation spectrogram features. 541-545 - Ankur G. Undhad, Hemant A. Patil, Maulik C. Madhavi:
Exploiting speech source information for vowel landmark detection for low resource language. 546-550 - Fei Chen, Ada H. Y. Lau:
Effect of vocoder type to Mandarin speech recognition in cochlear implant simulation. 551-554 - Surasak Boonkla, Masashi Unoki, Stanislav S. Makhanov, Chai Wutiwiwatchai:
Speech analysis method based on source-filter model using multivariate empirical mode decomposition in log-spectrum domain. 555-559 - Shota Morita, Xugang Lu, Masashi Unoki:
Signal to noise ratio estimation based on an optimal design of subband voice activity detection. 560-564 - Renbo Zhao, Siu Wa Lee, Dong-Yan Huang, Minghui Dong:
Soft constrained leading voice separation with music score guidance. 565-569 - Ai-Ying Zhang:
Using hierarchical method to improve real time for audio-based surveillance system. 570-573 - Chung-Chien Hsu, Kah-Meng Cheong, Tai-Shih Chi:
A non-uniformly distributed three-microphone array for speech enhancement in directional and diffuse noise field. 574-578 - Shizhe Chen, Qin Jin, Xirong Li, Gang Yang, Jieping Xu:
Speech emotion classification using acoustic features. 579-583 - Kelvin Poon-Feng, Dong-Yan Huang, Minghui Dong, Haizhou Li:
Acoustic emotion recognition based on fusion of multiple feature-dependent deep Boltzmann machines. 584-588 - Steven A. Rieger, Rajani Muraleedharan, Ravi Prakash Ramachandran:
Speech based emotion recognition using spectral feature extraction and an ensemble of kNN classifiers. 589-593 - Li Mei, Jing Zhu:
The discrimination of z-zh, c-ch and s-sh by proficient speakers of Chinese as second language. 594-598 - Xiaoli Feng, Yue Sun, Jinsong Zhang, Yanlu Xie:
A Study on the long-term retention effects of Japanese C2L learners to distinguish Mandarin Chinese Tone 2 and Tone 3 after perceptual training. 599-603 - Zhiyang He, Ping Lv, Ji Wu:
An effective and robust approach to Mandarin spoken language understanding in specific domain. 604-608 - Chunyun Zhang, Weiran Xu, Sheng Gao, Jun Guo:
A bottom-up kernel of pattern learning for relation extraction. 609-613 - Miao Li, Hongyi Ding, Ji Wu:
Global discriminative model for dependency parsing in NLP pipeline. 614-618 - Xiaomin Pang, Man-Wai Mak:
Fusion of SNR-dependent PLDA models for noise robust speaker verification. 619-623 - Maulik C. Madhavi, Hemant A. Patil:
Exploiting Variable length Teager Energy Operator in melcepstral features for person recognition from humming. 624-628 - Ashish Panda:
Psychoacoustic model compensation with robust feature set for speaker verification in additive noise. 629-632 - Chiu-yu Tseng, Chao-yu Su:
Where and how to make an emphasis? - L2 distinct prosody and why. 633-637