{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T21:04:09Z","timestamp":1745010249373,"version":"3.37.3"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"5","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2021,9,30]]},"abstract":"Sentiment analysis on social media relies on comprehending the natural language and using a robust machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. The cultural miscellanies, geographically limited trending topic hash-tags, access to aboriginal language keyboards, and conversational comfort in native language compound the linguistic challenges of sentiment analysis. This research evaluates the performance of cross-lingual contextual word embeddings and zero-shot transfer learning in projecting predictions from resource-rich English to resource-poor Hindi language. The cross-lingual XLM-RoBERTa classification model is trained and fine-tuned using the English language Benchmark SemEval 2017 dataset Task 4 A and subsequently zero-shot transfer learning is used to evaluate the classification model on two Hindi sentence-level sentiment analysis datasets, namely, IITP-Movie and IITP-Product review datasets. The proposed model compares favorably to state-of-the-art approaches and gives an effective solution to sentence-level (tweet-level) analysis of sentiments in a resource-poor scenario. 
The proposed model compares favorably to state-of-the-art approaches and achieves an average performance accuracy of 60.93 on both the Hindi datasets.","DOI":"10.1145\/3461764","type":"journal-article","created":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T20:06:29Z","timestamp":1625083589000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":31,"title":["Sentiment Analysis Using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4263-7168","authenticated-orcid":false,"given":"Akshi","family":"Kumar","sequence":"first","affiliation":[{"name":"Department of Computer Science & Engineering, Delhi Technological University, New Delhi, India"}]},{"given":"Victor Hugo C.","family":"Albuquerque","sequence":"additional","affiliation":[{"name":"Laboratory of Industrial Informatics, Electronics and Health, University of Fortaleza (UNIFOR), Cear\u00e1, Brazil"}]}],"member":"320","published-online":{"date-parts":[[2021,6,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.2200\/S00416ED1V01Y201204HLT016"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2019.102141"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2020.3005532"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3390298"},{"key":"e_1_2_1_5_1","volume-title":"Meld: A multimodal multi-party dataset for emotion recognition in conversations.","author":"Poria Soujanya","year":"2018","unstructured":"
Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Gautam Naik, Erik Cambria, and Rada Mihalcea. 2018. Meld: A multimodal multi-party dataset for emotion recognition in conversations. Retrieved from https:\/\/arXiv:1810.02508."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCI.2019.2954667"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval\u201916)","author":"Pontiki Maria","year":"2016","unstructured":"Maria Pontiki, Dimitrios Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad Al-Smadi, Mahmoud Al-Ayyoub et al. 2016. Semeval-2016 task 5: Aspect-based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval\u201916). 2016."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105010"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/NLPKE.2009.5313734"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2018.07.006"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360016"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eij.2020.04.003"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Valentin Barriere and Alexandra Balahur. 2020. 
Improving sentiment analysis over non-english tweets using multilingual transformers and automatic translation for data-augmentation. Retrieved from https:\/\/arXiv:2010.03486.","DOI":"10.18653\/v1\/2020.coling-main.23"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2976196"},{"key":"e_1_2_1_15_1","volume-title":"Florimond Gu\u00e9niat, and Harish Tayyar Madabushi.","author":"Leon De","year":"2020","unstructured":"De Leon, Frances Adriana Laureano, Florimond Gu\u00e9niat, and Harish Tayyar Madabushi. 2020. CS-embed-francesita at semeval-2020 Task 9: The effectiveness of code-switched word embeddings for sentiment analysis. Retrieved from https:\/\/arXiv:2006.04597."},{"key":"e_1_2_1_16_1","unstructured":"Anoop Kunchukuttan Divyanshu Kakwani Satish Golla Avik Bhattacharyya Mitesh M. Khapra and Pratyush Kumar. 2020. AI4Bharat-IndicNLP Corpus: Monolingual corpora and word embeddings for indic languages. Retrieved from https:\/\/arXiv:2005.00085."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201920)","author":"Karthikeyan K","year":"2020","unstructured":"K Karthikeyan, Zihan Wang, Stephen Mayhew, and Dan Roth. 2020. 
Cross-lingual ability of multilingual BERT: An empirical study. In Proceedings of the International Conference on Learning Representations (ICLR\u201920)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzm\u00e1n Edouard Grave Myle Ott Luke Zettlemoyer and Veselin Stoyanov. 2019. Unsupervised cross-lingual representation learning at scale. Retrieved from https:\/\/arXiv:1911.02116.","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"15349","DOI":"10.1007\/s11042-019-7346-5","article-title":"Systematic literature review on context-based sentiment analysis in social multimedia","volume":"79","author":"Akshi Kumar","year":"2019","unstructured":"Kumar Akshi and Geetanjali Garg. 2019. Systematic literature review on context-based sentiment analysis in social multimedia. Multimedia Tools Appl. 79, 21 (2019), 15349\u201315380.","journal-title":"Multimedia Tools Appl."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Jean-Benoit Delbrouck No\u00e9 Tits Mathilde Brousmiche and St\u00e9phane Dupont. 2020. A transformer-based joint-encoding for emotion recognition and sentiment analysis. 
Retrieved from https:\/\/arXiv:2006.15955.","DOI":"10.18653\/v1\/2020.challengehml-1.1"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.3233\/JIFS-179881"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Ahmed Sultan Mahmoud Salim Amina Gaber and Islam El Hosary. 2020. WESSA at SemEval-2020 Task 9: Code-mixed sentiment analysis using transformers. Retrieved from https:\/\/arXiv:2009.09879.","DOI":"10.18653\/v1\/2020.semeval-1.181"},{"key":"e_1_2_1_23_1","unstructured":"Y Kuratov M. Arkhipov. 2019. Adaptation of deep bidirectional multilingual transformers for Russian language. Retrieved from https:\/\/arXiv:1905.07213."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3342827.3342850"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the Fourteenth Workshop on Semantic Evaluation. 1276\u20131280","author":"Garain Avishek","year":"2020","unstructured":"Avishek Garain, Sainik Kumar Mahata, and Dipankar Das. 2020. JUNLP@ SemEval-2020 Task 9: Sentiment analysis of Hindi-English code mixed data using grid search cross validation. In Proceedings of the Fourteenth Workshop on Semantic Evaluation. 1276\u20131280. 
https:\/\/arxiv.org\/abs\/2007.12561."},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Somnath Banerjee Sahar Ghannay Sophie Rosset Anne Vilnat and Paolo Rosso. 2020. LIMSI_UPV at SemEval-2020 Task 9: Recurrent convolutional neural network for code-mixed sentiment analysis. Retrieved from https:\/\/arXiv:2008.13173.","DOI":"10.18653\/v1\/2020.semeval-1.172"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Parth Patwa Gustavo Aguilar Sudipta Kar Suraj Pandey Srinivas PYKL Bj\u00f6rn Gamb\u00e4ck Tanmoy Chakraborty Thamar Solorio and Amitava Das. 2020. Semeval-2020 task 9: Overview of sentiment analysis of code-mixed tweets. Retrieved from https:\/\/arxiv.org\/abs\/2008.04277.","DOI":"10.18653\/v1\/2020.semeval-1.100"},{"key":"e_1_2_1_28_1","volume-title":"Rajiv Ratn Shah","author":"Kumar Yaman","year":"2019","unstructured":"Yaman Kumar, Debanjan Mahata, Sagar Aggarwal, Anmol Chugh, Rajat Maheshwari, Rajiv Ratn Shah. 2019. BHAAV\u2014A text corpus for emotion analysis from Hindi stories. Retrieved from https:\/\/arXiv:1910.04073."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3383330"},{"key":"e_1_2_1_30_1","unstructured":"A. Bakliwal P. Arora and V. Varma. 2012. 
Hindi subjective lexicon: A lexical resource for Hindi polarity classification. Int. J. Comput. Linguist. Appl. (IJCLA) 2012"},{"volume-title":"Proceedings of the International Conference on Computational Linguistics (COLING\u201912)","author":"Balamurali A","key":"e_1_2_1_31_1","unstructured":"A Balamurali, R. Joshi, A, and P. Bhattacharyya. 2012. Cross-lingual sentiment analysis for Indian languages using linked wordnets. In Proceedings of the International Conference on Computational Linguistics (COLING\u201912)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-26832-3_61"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-26832-3_67"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-26832-3_65"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Akhtar M. S.","key":"e_1_2_1_35_1","unstructured":"M. S. Akhtar, A. Ekbal, and P. Bhattacharyya. 2016. Aspect-based sentiment analysis in Hindi: Resource creation and sentiment classification. 
In Proceedings of the International Conference on Language Resources and Evaluation (LREC\u201916)."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1053"},{"volume-title":"Proceedings of the NAACL Workshop on Vector Space Modeling.","author":"Luong Minh-Thang","key":"e_1_2_1_37_1","unstructured":"Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Bilingual word representations with monolingual quality in mind. In Proceedings of the NAACL Workshop on Vector Space Modeling."},{"volume-title":"Proceedings of the 26th International Conference on Computational Linguistics (COLING\u201916)","author":"Akhtar M. S.","key":"e_1_2_1_38_1","unstructured":"M. S. Akhtar, A. Kumar, A. Ekbal, and P. Bhattacharyya. 2016. A hybrid deep learning architecture for sentiment analysis. In Proceedings of the 26th International Conference on Computational Linguistics (COLING\u201916). 482\u2013493."},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Chi Sun Xipeng Qiu Yige Xu and Xuanjing Huang. 2019. How to fine-tune BERT for text classification? In Chinese Computational Linguistics Maosong Sun Xuanjing Huang Heng Ji Zhiyuan Liu and Yang Liu (Eds.). 
194\u2013206","DOI":"10.1007\/978-3-030-32381-3_16"},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Anne Lauscher Vinit Ravishankar Ivan Vuli\u0107 and Goran Glava\u0161. 2020. From zero to hero: On the limitations of zero-shot cross-lingual transfer with multilingual transformers. Retrieved from https:\/\/arXiv:2005.00633.","DOI":"10.18653\/v1\/2020.emnlp-main.363"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-019-0278-0"},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Sultan Ahmed Mahmoud Salim Amina Gaber and Islam El Hosary. 2020. WESSA at SemEval-2020 Task 9: Code-mixed sentiment analysis using transformers. Retrieved from https:\/\/arXiv:2009.09879.","DOI":"10.18653\/v1\/2020.semeval-1.181"},{"key":"e_1_2_1_43_1","doi-asserted-by":"crossref","unstructured":"Dat Quoc Nguyen Thanh Vu and Anh Tuan Nguyen. 2020. BERTweet: A pre-trained language model for English tweets. Retrieved from https:\/\/arXiv:2005.10200.","DOI":"10.18653\/v1\/2020.emnlp-demos.2"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-2094"},{"volume-title":"Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval\u201917)","author":"Baziotis C.","key":"e_1_2_1_45_1","unstructured":"C. Baziotis, N. Pelekis, and C. Doulkeridis. 2017. 
DataStories at SemEval-2017 Task 4: Deep LSTM with attention for message-level and topic-based sentiment analysis. Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval\u201917). 747\u2013754."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3461764","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,2]],"date-time":"2023-01-02T01:51:35Z","timestamp":1672624295000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3461764"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,30]]},"references-count":45,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,9,30]]}},"alternative-id":["10.1145\/3461764"],"URL":"https:\/\/doi.org\/10.1145\/3461764","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2021,6,30]]},"assertion":[{"value":"2020-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-06-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}