Computer Science ›› 2016, Vol. 43 ›› Issue (9): 103-106. doi: 10.11896/j.issn.1002-137X.2016.09.019
• The 3rd CCF Big Data Academic Conference (2015) •
杨蓓,周兰江,余正涛,刘丽佳
YANG Bei, ZHOU Lan-jiang, YU Zheng-tao and LIU Li-jia
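Abstract: Lao has very few corpus resources, so supervised learning cannot be applied directly to Lao lexical analysis. To address this problem, a semi-supervised part-of-speech tagging method for Lao is proposed. First, a simple probabilistic model is built from the small annotated tag dictionary and the unannotated corpus that are available, yielding a more complete tag dictionary. Second, integer programming is used to obtain a large automatically tagged corpus. Finally, once sufficient training data is available, a second-order hidden Markov model is trained to achieve high-quality Lao part-of-speech tagging. The proposed method performs well on Lao part-of-speech tagging, reaching an accuracy of 89.8%.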
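The final stage named in the abstract, tagging with a second-order hidden Markov model, can be illustrated with a minimal sketch. It is not the authors' implementation: the add-alpha smoothing, the function names (`train`, `viterbi`), and the toy English sentences standing in for an automatically tagged Lao corpus are all assumptions made for illustration. In a second-order model each tag is conditioned on the two preceding tags, so decoding runs Viterbi over pairs of tags.

```python
import math
from collections import defaultdict

START = "<s>"  # padding tag for the two positions before a sentence

def train(tagged_sentences, alpha=1.0):
    """Count trigram tag transitions and emissions (add-alpha smoothing is an assumption)."""
    trans = defaultdict(float)      # (t_{i-2}, t_{i-1}, t_i) -> count
    ctx = defaultdict(float)        # (t_{i-2}, t_{i-1}) -> count
    emit = defaultdict(float)       # (tag, word) -> count
    tag_count = defaultdict(float)  # tag -> count
    vocab = set()
    for sent in tagged_sentences:
        prev2, prev1 = START, START
        for word, tag in sent:
            trans[(prev2, prev1, tag)] += 1
            ctx[(prev2, prev1)] += 1
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            vocab.add(word)
            prev2, prev1 = prev1, tag
    return {"trans": dict(trans), "ctx": dict(ctx), "emit": dict(emit),
            "tag_count": dict(tag_count), "tags": sorted(tag_count),
            "vocab_size": len(vocab) + 1,  # +1 reserves mass for unseen words
            "alpha": alpha}

def log_trans(m, u, v, t):
    """Smoothed log P(t | u, v), where u and v are the two preceding tags."""
    a = m["alpha"]
    num = m["trans"].get((u, v, t), 0.0) + a
    den = m["ctx"].get((u, v), 0.0) + a * len(m["tags"])
    return math.log(num / den)

def log_emit(m, t, w):
    """Smoothed log P(w | t)."""
    a = m["alpha"]
    num = m["emit"].get((t, w), 0.0) + a
    den = m["tag_count"][t] + a * m["vocab_size"]
    return math.log(num / den)

def viterbi(m, words):
    """Return the most probable tag sequence; states are (previous tag, current tag)."""
    if not words:
        return []
    best = {(START, START): 0.0}
    back = []  # back[i][(v, t)] = tag two positions before word i on the best path
    for w in words:
        new_best, pointers = {}, {}
        for (u, v), score in best.items():
            for t in m["tags"]:
                s = score + log_trans(m, u, v, t) + log_emit(m, t, w)
                if (v, t) not in new_best or s > new_best[(v, t)]:
                    new_best[(v, t)] = s
                    pointers[(v, t)] = u
        best = new_best
        back.append(pointers)
    state = max(best, key=best.get)
    result = [state[1]]
    for i in range(len(words) - 1, 0, -1):
        state = (back[i][state], state[0])
        result.append(state[1])
    result.reverse()
    return result

# Hypothetical toy data standing in for an automatically tagged corpus.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train(toy_corpus)
print(viterbi(model, ["the", "dog", "sleeps"]))  # ['DET', 'NOUN', 'VERB']
```

Tracking tag pairs as states makes each decoding step cost on the order of the cube of the tag-set size, which is the usual price of the extra context a second-order model provides over a first-order one.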