{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,30]],"date-time":"2024-10-30T22:12:11Z","timestamp":1730326331550,"version":"3.28.0"},"publisher-location":"New York, NY, USA","reference-count":54,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,6]]},"DOI":"10.1145\/3580305.3599921","type":"proceedings-article","created":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T18:13:58Z","timestamp":1691172838000},"page":"5597-5607","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations at Twitter"],"prefix":"10.1145","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-6474-682X","authenticated-orcid":false,"given":"Xinyang","family":"Zhang","sequence":"first","affiliation":[{"name":"The University of Illinois at Urbana-Champaign, Urbana, IL, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4324-6433","authenticated-orcid":false,"given":"Yury","family":"Malkov","sequence":"additional","affiliation":[{"name":"Twitter Cortex, San Francisco, CA, USA"}]},{"ORCID":"http:\/\/orcid.org\/0009-0008-7884-8825","authenticated-orcid":false,"given":"Omar","family":"Florez","sequence":"additional","affiliation":[{"name":"Twitter Cortex, San Francisco, CA, USA"}]},{"ORCID":"http:\/\/orcid.org\/0009-0004-0131-245X","authenticated-orcid":false,"given":"Serim","family":"Park","sequence":"additional","affiliation":[{"name":"Twitter Cortex, San Francisco, CA, USA"}]},{"ORCID":"http:\/\/orcid.org\/0009-0002-7433-1702","authenticated-orcid":false,"given":"Brian","family":"McWilliams","sequence":"additional","affiliation":[{"name":"Twitter Cortex, San Francisco, CA, 
USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-3629-2696","authenticated-orcid":false,"given":"Jiawei","family":"Han","sequence":"additional","affiliation":[{"name":"The University of Illinois at Urbana-Champaign, Urbana, IL, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-0121-7781","authenticated-orcid":false,"given":"Ahmed","family":"El-Kishky","sequence":"additional","affiliation":[{"name":"Twitter Cortex, San Francisco, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,8,4]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Zuhair Khayyat, Manal Kalkatawi, Inji Ibrahim Jaber, and Xiangliang Zhang.","author":"Alharbi Basma","year":"2020","unstructured":"Basma Alharbi, Hind Alamro, Manal Abdulaziz Alshehri, Zuhair Khayyat, Manal Kalkatawi, Inji Ibrahim Jaber, and Xiangliang Zhang. 2020. ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset. ArXiv, Vol. abs\/2011.00578 (2020)."},{"key":"e_1_3_2_2_2_1","volume-title":"Luis Espinosa Anke, and Jos\u00e9 Camacho-Collados","author":"Barbieri Francesco","year":"2021","unstructured":"Francesco Barbieri, Luis Espinosa Anke, and Jos\u00e9 Camacho-Collados. 2021. XLM-T: A Multilingual Language Model Toolkit for Twitter. ArXiv, Vol. abs\/2104.12250 (2021)."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-1003"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"e_1_3_2_2_5_1","volume-title":"Translating embeddings for modeling multi-relational data. 
Advances in neural information processing systems","author":"Bordes Antoine","year":"2013","unstructured":"Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. Advances in neural information processing systems, Vol. 26 (2013)."},{"key":"e_1_3_2_2_6_1","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems, Vol. 33 (2020), 1877--1901."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"crossref","unstructured":"S. Chang, W. Han, J. Tang, G. Qi, C. Aggarwal, and T. Huang. 2015. Heterogeneous network embedding via deep architectures. In SIGKDD. 119--128.","DOI":"10.1145\/2783258.2783296"},{"key":"e_1_3_2_2_8_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A Simple Framework for Contrastive Learning of Visual Representations. 
In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 119), Hal Daum\u00e9 III and Aarti Singh (Eds.). PMLR, 1597--1607. https:\/\/proceedings.mlr.press\/v119\/chen20j.html"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"crossref","unstructured":"T. Chen and Y. Sun. 2017. Task-guided and path-augmented heterogeneous network embedding for author identification. In WSDM. 295--304.","DOI":"10.1145\/3018661.3018735"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2566486.2567997"},{"key":"e_1_3_2_2_11_1","volume-title":"8th International Conference on Learning Representations, ICLR 2020","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. 
https:\/\/openreview.net\/forum?id=r1xMH1BtvB"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"crossref","unstructured":"Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzm\u00e1n, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale. In ACL.","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"e_1_3_2_2_13_1","volume-title":"Cross-lingual Language Model Pretraining. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019","author":"Conneau Alexis","year":"2019","unstructured":"Alexis Conneau and Guillaume Lample. 2019. Cross-lingual Language Model Pretraining. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 7057--7067. 
https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/c04c19c2c2474dbf5f7ac4372c5b9af1-Abstract.html"},{"key":"e_1_3_2_2_14_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Jill Burstein, Christy Doran, and Thamar Solorio (Eds.). Association for Computational Linguistics, 4171--4186. https:\/\/doi.org\/10.18653\/v1\/n19-1423"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"crossref","unstructured":"Y. Dong, N. Chawla, and A. Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In SIGKDD. 
135--144.","DOI":"10.1145\/3097983.3098036"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3542598"},{"key":"e_1_3_2_2_17_1","volume-title":"kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-interest Candidate Retrieval. arXiv preprint arXiv:2205.06205","author":"El-Kishky Ahmed","year":"2022","unstructured":"Ahmed El-Kishky, Thomas Markovich, Kenny Leung, Frank Portman, and Aria Haghighi. 2022b. kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-interest Candidate Retrieval. arXiv preprint arXiv:2205.06205 (2022)."},{"key":"e_1_3_2_2_18_1","volume-title":"TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation. In KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","author":"El-Kishky Ahmed","year":"2022","unstructured":"Ahmed El-Kishky, Thomas Markovich, Serim Park, Chetan Verma, Baekjin Kim, Ramy Eskander, Yury Malkov, Frank Portman, Sof\u00eda Samaniego, Ying Xiao, and Aria Haghighi. 2022. TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation. 
In KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14-18, 2022, Aidong Zhang and Huzefa Rangwala (Eds.). ACM, 2842--2850. https:\/\/doi.org\/10.1145\/3534678.3539080"},{"key":"e_1_3_2_2_19_1","volume-title":"Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. CoRR","author":"Fedus William","year":"2021","unstructured":"William Fedus, Barret Zoph, and Noam Shazeer. 2021. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. CoRR, Vol. abs\/2101.03961 (2021). arXiv:2101.03961 https:\/\/arxiv.org\/abs\/2101.03961"},{"key":"e_1_3_2_2_20_1","unstructured":"Y. Goldberg and O. Levy. 2014. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722 (2014)."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"crossref","unstructured":"A. Grover and J. Leskovec. 2016. node2vec: Scalable feature learning for networks. In SIGKDD. 855--864.","DOI":"10.1145\/2939672.2939754"},{"key":"e_1_3_2_2_22_1","volume-title":"Product quantization for nearest neighbor search","author":"Jegou Herve","year":"2010","unstructured":"Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. 
Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence, Vol. 33, 1 (2010), 117--128."},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_3_2_2_24_1","volume-title":"Pytorch-biggraph: A large-scale graph embedding system. arXiv preprint arXiv:1903.12287","author":"Lerer Adam","year":"2019","unstructured":"Adam Lerer, Ledell Wu, Jiajun Shen, Timothee Lacroix, Luca Wehrstedt, Abhijit Bose, and Alex Peysakhovich. 2019. Pytorch-biggraph: A large-scale graph embedding system. arXiv preprint arXiv:1903.12287 (2019)."},{"key":"e_1_3_2_2_25_1","unstructured":"Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Qi Ju, Haotang Deng, and Ping Wang. 2020. K-BERT: Enabling Language Representation with Knowledge Graph. In AAAI."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3539210"},{"key":"e_1_3_2_2_27_1","volume-title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR, Vol. abs\/1907.11692 (2019). 
arXiv:1907.11692 http:\/\/arxiv.org\/abs\/1907.11692"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-demo.25"},{"key":"e_1_3_2_2_29_1","volume-title":"Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems","author":"Meng Yu","year":"2021","unstructured":"Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, and Xia Song. 2021. COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 23102--23114. https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/c2c2a04512b35d13102459f8784f1a2d-Abstract.html"},{"key":"e_1_3_2_2_30_1","volume-title":"NeurIPS","volume":"26","author":"Mikolov T.","year":"2013","unstructured":"T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. 2013. Distributed representations of words and phrases and their compositionality. NeurIPS, Vol. 
26 (2013)."},{"key":"e_1_3_2_2_31_1","volume-title":"Non-Parametric Temporal Adaptation for Social Media Topic Classification. arXiv preprint arXiv:2209.05706","author":"Mireshghallah Fatemehsadat","year":"2022","unstructured":"Fatemehsadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, and Taylor Berg-Kirkpatrick. 2022. Non-Parametric Temporal Adaptation for Social Media Topic Classification. arXiv preprint arXiv:2209.05706 (2022)."},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.2"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.semeval-1.100"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623732"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_2_36_1","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog, Vol. 1, 8 (2019), 9."},{"key":"e_1_3_2_2_37_1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res., Vol. 21 (2020), 140:1--140:67. 
http:\/\/jmlr.org\/papers\/v21\/20-074.html","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_2_38_1","volume-title":"SemEval-2017 Task 4: Sentiment Analysis in Twitter. CoRR","author":"Rosenthal Sara","year":"2019","unstructured":"Sara Rosenthal, Noura Farra, and Preslav Nakov. 2019. SemEval-2017 Task 4: Sentiment Analysis in Twitter. CoRR, Vol. abs\/1912.00741 (2019). arXiv:1912.00741 http:\/\/arxiv.org\/abs\/1912.00741"},{"key":"e_1_3_2_2_39_1","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021","author":"Rust Phillip","year":"2021","unstructured":"Phillip Rust, Jonas Pfeiffer, Ivan Vulic, Sebastian Ruder, and Iryna Gurevych. 2021. How Good is Your Tokenizer? 
On the Monolingual Performance of Multilingual Language Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli (Eds.). Association for Computational Linguistics, 3118--3135. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.243"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371811"},{"key":"e_1_3_2_2_41_1","volume-title":"Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. CoRR","author":"Shoeybi Mohammad","year":"2019","unstructured":"Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. 2019. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. CoRR, Vol. abs\/1909.08053 (2019). arXiv:1909.08053 http:\/\/arxiv.org\/abs\/1909.08053"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"crossref","unstructured":"Y. Sun and J. Han. 2013. Mining heterogeneous information networks: a structural analysis approach. 
ACM SIGKDD Explorations Newsletter (2013).","DOI":"10.1145\/2481244.2481248"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.2197\/ipsjjip.27.404"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783307"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741093"},{"key":"e_1_3_2_2_46_1","unstructured":"T. Trouillon, J. Welbl, S. Riedel, \u00c9. Gaussier, and G. Bouchard. 2016. Complex embeddings for simple link prediction. In ICML. PMLR, 2071--2080."},{"key":"e_1_3_2_2_47_1","volume-title":"Attention is all you need. Advances in neural information processing systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems, Vol. 30 (2017)."},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2754499"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"crossref","unstructured":"L. Xu, X. Wei, J. Cao, and P. Yu. 2017. Embedding of embedding (EOE) joint embedding for coupled heterogeneous networks. In WSDM. 
741--749.","DOI":"10.1145\/3018661.3018723"},{"key":"e_1_3_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.41"},{"key":"e_1_3_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.31193\/ssap.01.9787509752807"},{"key":"e_1_3_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.551"},{"key":"e_1_3_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219890"},{"key":"e_1_3_2_2_54_1","series-title":"ERNIE: Enhanced Language Representation with Informative Entities. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL","volume-title":"Long Papers, Anna Korhonen, David R. Traum, and Llu\u00eds M\u00e0rquez (Eds.)","author":"Zhang Zhengyan","year":"2019","unstructured":"Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu. 2019. ERNIE: Enhanced Language Representation with Informative Entities. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28-August 2, 2019, Volume 1: Long Papers, Anna Korhonen, David R. Traum, and Llu\u00eds M\u00e0rquez (Eds.). Association for Computational Linguistics, 1441--1451. 
https:\/\/doi.org\/10.18653\/v1\/p19-1139"}],"event":{"name":"KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Long Beach CA USA","acronym":"KDD '23"},"container-title":["Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3580305.3599921","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T05:31:57Z","timestamp":1694237517000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580305.3599921"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,4]]},"references-count":54,"alternative-id":["10.1145\/3580305.3599921","10.1145\/3580305"],"URL":"https:\/\/doi.org\/10.1145\/3580305.3599921","relation":{},"subject":[],"published":{"date-parts":[[2023,8,4]]},"assertion":[{"value":"2023-08-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}