{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,8]],"date-time":"2024-08-08T13:40:28Z","timestamp":1723124428434},"reference-count":31,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2022,10,28]],"date-time":"2022-10-28T00:00:00Z","timestamp":1666915200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"P-Direkt, Ministry of the Interior and Kingdom Relations"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"We explore the use case of question answering (QA) by a contact centre for 130,000 Dutch government employees in the domain of questions about human resources (HR). HR questions can be answered using personnel files or general documentation, with the latter being the focus of the current research. We created a Dutch HR QA dataset with over 300 questions in the format of the Squad 2.0 dataset, which distinguishes between answerable and unanswerable questions. We applied various BERT-based models, either directly or after finetuning on the new dataset. The F1-scores reached 0.47 for unanswerable questions and 1.0 for answerable questions depending on the topic; however, large variations in scores were observed. 
We conclude more data are needed to further improve the performance of this task.","DOI":"10.3390\/info13110513","type":"journal-article","created":{"date-parts":[[2022,10,30]],"date-time":"2022-10-30T11:26:50Z","timestamp":1667129210000},"page":"513","source":"Crossref","is-referenced-by-count":0,"title":["Exploring the Utility of Dutch Question Answering Datasets for Human Resource Contact Centres"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-6808-6291","authenticated-orcid":false,"given":"Cha\u00efm","family":"van Toledo","sequence":"first","affiliation":[{"name":"Department of Information and Computing Sciences, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands"}]},{"given":"Marijn","family":"Schraagen","sequence":"additional","affiliation":[{"name":"Department of Information and Computing Sciences, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-8696-1045","authenticated-orcid":false,"given":"Friso","family":"van Dijk","sequence":"additional","affiliation":[{"name":"Department of Information and Computing Sciences, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-1054-6683","authenticated-orcid":false,"given":"Matthieu","family":"Brinkhuis","sequence":"additional","affiliation":[{"name":"Department of Information and Computing Sciences, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-9237-221X","authenticated-orcid":false,"given":"Marco","family":"Spruit","sequence":"additional","affiliation":[{"name":"Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Rajpurkar, P., Jia, R., and Liang, P. 
(2018). Know What You Don\u2019t Know: Unanswerable Questions for SQuAD. arXiv.","DOI":"10.18653\/v1\/P18-2124"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Poibeau, T., Saggion, H., Piskorski, J., and Yangarber, R. (2013). Entity Linking: Finding Extracted Entities in a Knowledge Base. Multi-Source, Multilingual Information Extraction and Summarization, Springer.","DOI":"10.1007\/978-3-642-28569-1"},{"key":"ref_3","unstructured":"Zhang, Z., Yang, J., and Zhao, H. (2021, January 2\u20139). Retrospective reader for machine reading comprehension. Proceedings of the AAAI Conference on Artificial Intelligence, Virtually."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Trischler, A., Wang, T., Yuan, X., Harris, J., Sordoni, A., Bachman, P., and Suleman, K. (2017, January 3). NewsQA: A Machine Comprehension Dataset. Proceedings of the 2nd Workshop on Representation Learning for NLP, Vancouver, BC, Canada.","DOI":"10.18653\/v1\/W17-2623"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Rajpurkar, P., Zhang, J., Lopyrev, K., and Liang, P. (2016). SQuAD: 100,000+ Questions for Machine Comprehension of Text. arXiv.","DOI":"10.18653\/v1\/D16-1264"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Choi, E., He, H., Iyyer, M., Yatskar, M., Yih, W.t., Choi, Y., Liang, P., and Zettlemoyer, L. (November, January 31). QuAC: Question Answering in Context. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-1241"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1162\/tacl_a_00266","article-title":"CoQA: A Conversational Question Answering Challenge","volume":"7","author":"Reddy","year":"2019","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yagcioglu, S., Erdem, A., Erdem, E., and Ikizler-Cinbis, N. (November, January 31). 
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-1166"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Castelli, V., Chakravarti, R., Dana, S., Ferritto, A., Florian, R., Franz, M., Garg, D., Khandelwal, D., McCarley, J.S., and McCawley, M. (2020, January 5\u201310). The TechQA Dataset. Proceedings of the Association for Computational Linguistics (ACL), Seattle, WA, USA.","DOI":"10.18653\/v1\/2020.acl-main.117"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhong, H., Xiao, C., Tu, C., Zhang, T., Liu, Z., and Sun, M. (2019). JEC-QA: A Legal-Domain Question Answering Dataset. arXiv.","DOI":"10.1609\/aaai.v34i05.6519"},{"key":"ref_11","unstructured":"Carrino, C.P., Costa-juss\u00e0, M.R., and Fonollosa, J.A.R. (2020, January 11\u201316). Automatic Spanish Translation of SQuAD Dataset for Multi-lingual Question Answering. Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"d\u2019Hoffschmidt, M., Belblidia, W., Heinrich, Q., Brendl\u00e9, T., and Vidal, M. (2020). FQuAD: French Question Answering Dataset. Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2020.findings-emnlp.107"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Abadani, N., Mozafari, J., Fatemi, A., Nematbakhsh, M.A., and Kazemi, A. (2021, January 19\u201320). ParSQuAD: Machine Translated SQuAD dataset for Persian Question Answering. Proceedings of the 2021 7th International Conference on Web Research (ICWR), Tehran, Iran.","DOI":"10.1109\/ICWR51868.2021.9443126"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Mozannar, H., Maamary, E., El Hajal, K., and Hajj, H. (2019, January 28). 
Neural Arabic Question Answering. Proceedings of the Fourth Arabic Natural Language Processing Workshop, Florence, Italy.","DOI":"10.18653\/v1\/W19-4612"},{"key":"ref_15","unstructured":"Rogers, A., Gardner, M., and Augenstein, I. (2021). QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"23","DOI":"10.5121\/ijnlc.2020.9602","article-title":"Dutch Named Entity Recognition and De-Identification Methods for the Human Resource Domain","volume":"9","author":"Spruit","year":"2020","journal-title":"Int. J. Nat. Lang. Comput."},{"key":"ref_17","unstructured":"Kouzis-Loukas, D. (2016). Learning Scrapy, Packt Publishing Ltd."},{"key":"ref_18","unstructured":"Richardson, L. (2022, August 21). Beautiful Soup Documentation. Available online: https:\/\/www.crummy.com\/software\/BeautifulSoup\/bs4\/doc\/."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Delobelle, P., Winters, T., and Berendt, B. (2020). RobBERT: A Dutch RoBERTa-based Language Model. Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2020.findings-emnlp.292"},{"key":"ref_20","unstructured":"Honnibal, M., Montani, I., Van Landeghem, S., and Boyd, A. (2022). spaCy: Industrial-Strength Natural Language Processing in Python, Zenodo."},{"key":"ref_21","unstructured":"Reeve, J. (2020). Text-Matcher, GitHub Repository."},{"key":"ref_22","unstructured":"Pander Maat, H., Kraf, R., and Dekker, N. (2022, August 20). Handleiding T-Scan 2014. Available online: https:\/\/raw.githubusercontent.com\/proycon\/tscan\/master\/docs\/tscanhandleiding.pdf."},{"key":"ref_23","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Xia, P., Wu, S., and Durme, B.V. (2020, January 16\u201320). Which *BERT? A Survey Organizing Contextualized Encoders. Proceedings of the EMNLP, Online.","DOI":"10.18653\/v1\/2020.emnlp-main.608"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"5789","DOI":"10.1007\/s10462-021-09958-2","article-title":"Transformer models for text-based emotion detection: A review of BERT-based approaches","volume":"54","author":"Acheampong","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_26","unstructured":"de Vries, W., van Cranenburgh, A., Bisazza, A., Caselli, T., Noord, G.v., and Nissim, M. (2019). BERTje: A Dutch BERT Model. arXiv."},{"key":"ref_27","unstructured":"Brandsen, A., Dirkson, A., Verberne, S., Sappelli, M., Manh Chu, D., and Stoutjesdijk, K. (2019, January 23\u201327). BERT-NL a set of language models pre-trained on the Dutch SoNaR corpus. Proceedings of the Dutch-Belgian Information Retrieval Conference (DIR 2019), Wuhan, China."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Ma, C., Jernite, Y., and Plu, J. (2020, January 16\u201320). Transformers: State-of-the-Art Natural Language Processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_29","unstructured":"Borzymowski, H. (2020, January 10\u201312). henryk\/bert-base-multilingual-cased-finetuned-dutch-squad2 \u00b7 Hugging Face. 
Proceedings of the Benelux Conference on Artificial Intelligence, Esch-sur-Alzette, Luxembourg."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Ohsugi, Y., Saito, I., Nishida, K., Asano, H., and Tomita, J. (2019, January 1). A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension. Proceedings of the First Workshop on NLP for Conversational AI, Florence, Italy.","DOI":"10.18653\/v1\/W19-4102"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Qu, C., Yang, L., Qiu, M., Croft, W.B., Zhang, Y., and Iyyer, M. (2019, January 21\u201325). BERT with History Answer Embedding for Conversational Question Answering. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France.","DOI":"10.1145\/3331184.3331341"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/13\/11\/513\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,8]],"date-time":"2024-08-08T12:44:16Z","timestamp":1723121056000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/13\/11\/513"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,28]]},"references-count":31,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["info13110513"],"URL":"https:\/\/doi.org\/10.3390\/info13110513","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2022,10,28]]}}}