Abstract
We present the results of Shared Task 1 held at the 2020 Conference on Natural Language Processing and Chinese Computing (NLPCC): Light Pre-Trained Chinese Language Model for NLP Tasks. This shared task examines the performance of lightweight language models on four common NLP tasks: Text Classification, Named Entity Recognition, Anaphora Resolution, and Machine Reading Comprehension. To ensure that the models are lightweight, we placed restrictions on the number of parameters and the inference speed of the participating models. In total, 30 teams registered for the task. Each submission was evaluated through our online benchmark system (https://www.cluebenchmarks.com/nlpcc2020.html), with the average score over the four tasks taken as the final score. Participants explored a variety of ideas and frameworks, including data augmentation, knowledge distillation, and quantization. The best model achieved an average score of 75.949, which is very close to BERT-base (76.460). We believe this shared task highlights the potential of lightweight models and calls for further research on their development.
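Knowledge distillation is one of the compression techniques the abstract lists. As a generic, hedged illustration only, and not the training code of any participating team, the sketch below shows the standard soft-target distillation objective in PyTorch; the temperature T and mixing weight alpha are illustrative assumptions rather than values used in the shared task.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.9) -> torch.Tensor:
    """Soft-target distillation loss mixed with hard-label cross-entropy.

    T and alpha are illustrative hyperparameters, not values from any submission.
    """
    # KL divergence between temperature-softened teacher and student distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

In a typical setup, the teacher's logits are computed under torch.no_grad() and the lightweight student is trained on this combined loss; quantization, also mentioned above, would then be applied to the trained student.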
Notes
- 6. See the most recent results at https://www.cluebenchmarks.com/nlpcc2020.html.
- 10. At the time of writing, 69.289 is the best score achieved by our baseline.
- 11. Huawei Cloud & Noah's Ark Lab submitted their Rank 3 model instead of their best one.
- 12. Thanks to Xiaomi AI Lab, who submitted this BERT-base model, although it was not fully fine-tuned.
Acknowledgements
Many thanks to NLPCC for giving us the opportunity to organize this task, and to everyone who took part in it.
A Appendix
A.1 List of Literary Works Selected in CLUEWSC2020
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Li, J., Hu, H., Zhang, X., Li, M., Li, L., Xu, L. (2020). Light Pre-Trained Chinese Language Model for NLP Tasks. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol. 12431. Springer, Cham. https://doi.org/10.1007/978-3-030-60457-8_47