{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T08:27:48Z","timestamp":1725956868011},"reference-count":49,"publisher":"MIT Press - Journals","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Transactions of the Association for Computational Linguistics"],"published-print":{"date-parts":[[2020,12]]},"abstract":" Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset. Intrigued by these results, we find that the key to their success is generalization from a small amount of counterexamples where the spurious correlations do not hold. When such minority examples are scarce, pre-trained models perform as poorly as models trained from scratch. In the case of extreme minority, we propose to use multi-task learning (MTL) to improve generalization. Our experiments on natural language inference and paraphrase identification show that MTL with the right auxiliary tasks significantly improves performance on challenging examples without hurting the in-distribution performance. Further, we show that the gain from MTL mainly comes from improved generalization from the minority examples. Our results highlight the importance of data diversity for overcoming spurious correlations. 
","DOI":"10.1162\/tacl_a_00335","type":"journal-article","created":{"date-parts":[[2020,10,15]],"date-time":"2020-10-15T16:13:15Z","timestamp":1602778395000},"page":"621-633","source":"Crossref","is-referenced-by-count":27,"title":["An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models"],"prefix":"10.1162","volume":"8","author":[{"given":"Lifu","family":"Tu","sequence":"first","affiliation":[{"name":"Toyota Technological Institute at Chicago."}]},{"given":"Garima","family":"Lalwani","sequence":"additional","affiliation":[{"name":"Amazon AI."}]},{"given":"Spandana","family":"Gella","sequence":"additional","affiliation":[{"name":"Amazon AI."}]},{"given":"He","family":"He","sequence":"additional","affiliation":[{"name":"New York University."}]}],"member":"281","reference":[{"key":"bib1","volume-title":"Association for Computational Linguistics (ACL)","author":"Akula Arjun R.","year":"2020"},{"key":"bib2","volume-title":"Association for Computational Linguistics (ACL)","author":"Andreas J.","year":"2020"},{"key":"bib3","author":"Arjovsky M.","year":"2019","journal-title":"arXiv preprint arXiv:1907.02893v2"},{"key":"bib4","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1613\/jair.731","volume":"12","author":"Baxter J.","year":"2000","journal-title":"Journal of Artificial Intelligence Research (JAIR)"},{"key":"bib5","volume-title":"Empirical Methods in Natural Language Processing (EMNLP)","author":"Bowman S.","year":"2015"},{"key":"bib6","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","author":"Carmon Y.","year":"2019"},{"issue":"1","key":"bib7","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1023\/A:1007379606734","volume":"28","author":"Caruana Rich","year":"1997","journal-title":"Machine Learning"},{"key":"bib8","volume-title":"Proceedings of the Eleventh International Workshop on Semantic Evaluations","author":"Cer 
D.","year":"2017"},{"key":"bib9","volume-title":"Empirical Methods in Natural Language Processing (EMNLP)","author":"Clark C.","year":"2019"},{"key":"bib10","volume-title":"Annual Meeting of the Cognitive Science Society, CogSci 2018","author":"Dasgupta Ishita","year":"2018"},{"key":"bib11","volume-title":"North American Association for Computational Linguistics (NAACL)","author":"Devlin J.","year":"2019"},{"key":"bib12","volume-title":"Proceedings of the International Workshop on Paraphrasing","author":"Dolan W. B.","year":"2005"},{"key":"bib13","volume-title":"Association for Computational Linguistics (ACL)","author":"Glockner M.","year":"2018"},{"key":"bib14","volume-title":"International Conference on Machine Learning (ICML)","author":"Goyal Y.","year":"2019"},{"key":"bib15","volume-title":"International Conference on Machine Learning (ICML)","author":"Guo C.","year":"2017"},{"key":"bib16","first-page":"1","volume":"21","author":"Guo J.","year":"2020","journal-title":"Journal of Machine Learning Research (JMLR)"},{"key":"bib17","volume-title":"North American Association for Computational Linguistics (NAACL)","author":"Gururangan S.","year":"2018"},{"key":"bib18","volume-title":"Empirical Methods in Natural Language Processing (EMNLP)","author":"Hashimoto K.","year":"2017"},{"key":"bib19","volume-title":"Proceedings of the EMNLP Workshop on Deep Learning for Low-Resource NLP","author":"He H.","year":"2019"},{"key":"bib20","volume-title":"International Conference on Machine Learning (ICML)","author":"Hendrycks D.","year":"2019"},{"key":"bib21","volume-title":"Association for Computational Linguistics (ACL)","author":"Hendrycks D.","year":"2020"},{"key":"bib22","author":"Iyer S.","year":"2017","journal-title":"Accessed online at"},{"key":"bib23","volume-title":"Association for Computational Linguistics (ACL)","author":"Jha R.","year":"2020"},{"key":"bib24","volume-title":"Association for Computational Linguistics (ACL)","author":"Jia 
R.","year":"2016"},{"key":"bib25","volume-title":"International Conference on Learning Representations (ICLR)","author":"Kaushik D.","year":"2020"},{"key":"bib26","first-page":"212","volume-title":"Proceedings of the 2nd Workshop on Machine Reading for Question Answering","author":"Li Hongyu","year":"2019"},{"key":"bib27","first-page":"1957","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Liu Miaofeng","year":"2019"},{"key":"bib28","volume-title":"North American Association for Computational Linguistics (NAACL)","author":"Liu N. F.","year":"2019"},{"key":"bib29","volume-title":"Association for Computational Linguistics (ACL)","author":"Liu Xiaodong","year":"2019"},{"key":"bib30","author":"Liu Y.","year":"2019","journal-title":"arXiv preprint arXiv:1907.11692"},{"key":"bib31","volume-title":"Association for Computational Linguistics (ACL)","author":"Mahabadi R. K.","year":"2020"},{"key":"bib32","volume-title":"Association for Computational Linguistics (ACL) System Demonstrations","author":"Manning Christopher D.","year":"2014"},{"key":"bib33","first-page":"1","volume":"17","author":"Maurer A.","year":"2016","journal-title":"Journal of Machine Learning Research (JMLR)"},{"key":"bib34","author":"McCoy R. T.","year":"2019","journal-title":"arXiv preprint arXiv:1902.01007"},{"key":"bib35","volume-title":"Association for Computational Linguistics (ACL)","author":"Min J.","year":"2020"},{"key":"bib36","first-page":"pages 6867\u20136874","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Nie Yixin","year":"2019"},{"key":"bib37","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","author":"Nye M. 
I.","year":"2019"},{"key":"bib38","volume-title":"Empirical Methods in Natural Language Processing (EMNLP)","author":"Oren Y.","year":"2019"},{"key":"bib39","author":"Raffel C.","year":"2019","journal-title":"arXiv preprint arXiv:1910.10683"},{"key":"bib40","author":"Ruder Sebastian","year":"2017","journal-title":"arXiv preprint arXiv:1707.05246"},{"key":"bib41","author":"Ruder S.","year":"2017","journal-title":"arXiv preprint arXiv:1706.05098"},{"key":"bib42","volume-title":"International Conference on Learning Representations (ICLR)","author":"Sagawa S.","year":"2020"},{"key":"bib43","first-page":"5014","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","author":"Schmidt L.","year":"2018"},{"key":"bib44","volume-title":"Association for Computational Linguistics (ACL)","author":"S\u00f8gaard A.","year":"2016"},{"key":"bib45","author":"Williams A.","year":"2017","journal-title":"arXiv preprint arXiv:1704.05426"},{"key":"bib46","author":"Yaghoobzadeh Yadollah","year":"2019","journal-title":"CoRR"},{"key":"bib47","author":"Zhang T.","year":"2020","journal-title":"arXiv preprint arXiv:2006. 
05987"},{"key":"bib48","volume-title":"North American Association for Computational Linguistics (NAACL)","author":"Zhang Y.","year":"2019"},{"key":"bib49","volume-title":"Association for Computational Linguistics (ACL)","author":"Zhou X.","year":"2020"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00335","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:39:44Z","timestamp":1615585184000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/96483"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12]]},"references-count":49,"alternative-id":["10.1162\/tacl_a_00335"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00335","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12]]}}}