Abstract
This paper presents an empirical study of four language model adaptation techniques, a maximum a posteriori (MAP) method and three discriminative training methods, applied to Japanese Kana-Kanji conversion. We compare the performance of these methods from several angles by adapting a baseline model to four adaptation domains. In particular, we interpret the character error rate (CER) results by correlating them with characteristics of the adaptation domain, measured using the information-theoretic notion of cross entropy. We show that this metric correlates well with the CER performance of the adaptation methods, and that the discriminative methods are not only superior to the MAP-based method in achieving larger CER reductions, but are also more robust to variation in the similarity between the background and adaptation domains.
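The domain-similarity metric the abstract refers to is the cross entropy of an adaptation-domain corpus under the background language model: the lower the value, the more similar the domains. The sketch below illustrates that computation in a deliberately minimal form, using an add-one-smoothed unigram background model over toy corpora; the function names and data are hypothetical, and the paper's actual models and preprocessing are not reproduced here.

import math
from collections import Counter

def train_unigram(tokens, vocab):
    # Add-one-smoothed unigram probabilities over a fixed vocabulary.
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def cross_entropy(adapt_tokens, bg_model):
    # H = -(1/N) * sum_w log2 p_bg(w): the average number of bits per token
    # the background model needs to encode adaptation-domain text.
    return -sum(math.log2(bg_model[w]) for w in adapt_tokens) / len(adapt_tokens)

# Toy stand-ins for the background and adaptation corpora.
background = "the model predicts the next word given the previous words".split()
adaptation = "the adapted model predicts domain words".split()

vocab = set(background) | set(adaptation)
bg = train_unigram(background, vocab)
print(f"cross entropy: {cross_entropy(adaptation, bg):.3f} bits/token")

A full study would compute the same quantity with the actual background model (e.g., an n-gram model with proper smoothing) over each adaptation corpus, and then compare the resulting values against each method's CER reduction.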
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yuan, W., Gao, J., Suzuki, H. (2005). An Empirical Study on Language Model Adaptation Using a Metric of Domain Similarity. In: Dale, R., Wong, K.F., Su, J., Kwong, O.Y. (eds.) Natural Language Processing – IJCNLP 2005. Lecture Notes in Computer Science, vol. 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_83
DOI: https://doi.org/10.1007/11562214_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29172-5
Online ISBN: 978-3-540-31724-1