{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T23:36:52Z","timestamp":1740181012602,"version":"3.37.3"},"reference-count":36,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T00:00:00Z","timestamp":1695340800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003151","name":"Fonds de recherche du Qu\u00e9bec - Nature et Technologies","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003151","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"In the field of automatic text simplification, assessing whether or not the meaning of the original text has been preserved during simplification is of paramount importance. Metrics relying on n-gram overlap assessment may struggle to deal with simplifications which replace complex phrases with their simpler paraphrases. Current evaluation metrics for meaning preservation based on large language models (LLMs), such as BertScore in machine translation or QuestEval in summarization, have been proposed. However, none has a strong correlation with human judgment of meaning preservation. Moreover, such metrics have not been assessed in the context of text simplification research. In this study, we present a meta-evaluation of several metrics we apply to measure content similarity in text simplification. We also show that the metrics are unable to pass two trivial, inexpensive content preservation tests. 
Another contribution of this study is MeaningBERT (https:\/\/github.com\/GRAAL-Research\/MeaningBERT), a new trainable metric designed to assess meaning preservation between two sentences in text simplification, showing how it correlates with human judgment. To demonstrate its quality and versatility, we will also present a compilation of datasets used to assess meaning preservation and benchmark our study against a large selection of popular metrics.","DOI":"10.3389\/frai.2023.1223924","type":"journal-article","created":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T19:08:52Z","timestamp":1695409732000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["MeaningBERT: assessing meaning preservation between sentences"],"prefix":"10.3389","volume":"6","author":[{"given":"David","family":"Beauchemin","sequence":"first","affiliation":[]},{"given":"Horacio","family":"Saggion","sequence":"additional","affiliation":[]},{"given":"Richard","family":"Khoury","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2023,9,22]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.424","article-title":"\u201cASSET: a dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations,\u201d","author":"Alva-Manchego","year":"2020","journal-title":"Annual Meeting of the Association for Computational Linguistics"},{"key":"B2","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1162\/coli_a_00418","article-title":"The (un)suitability of automatic evaluation metrics for text simplification","volume":"47","author":"Alva-Manchego","year":"2021","journal-title":"Comput. 
Linguist"},{"key":"B3","first-page":"65","article-title":"\u201cMETEOR: an automatic metric for MT evaluation with improved correlation with human judgments,\u201d","author":"Banerjee","year":"2005","journal-title":"Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization"},{"key":"B4","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"B5","first-page":"4171","article-title":"\u201cBERT: pre-training of deep bidirectional transformers for language understanding,\u201d","author":"Devlin","year":"2019","journal-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies"},{"key":"B6","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1162\/tacl_a_00373","article-title":"Summeval: re-evaluating summarization evaluation","volume":"9","author":"Fabbri","year":"2021","journal-title":"Trans. Assoc. Comput. Linguist"},{"key":"B7","first-page":"344","article-title":"A readability formula in practice","volume":"25","author":"Flesch","year":"1948","journal-title":"Elem. English"},{"key":"B8","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1613\/jair.5477","article-title":"Survey of the state of the art in natural language generation: core tasks, applications and evaluation","volume":"61","author":"Gatt","year":"2018","journal-title":"J. Artif. Intell. Res"},{"key":"B9","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1177\/002194366900600202","article-title":"The fog index after twenty years","volume":"6","author":"Gunning","year":"1969","journal-title":"J. Bus. 
Commun"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-7138-7","author":"James","year":"2013","journal-title":"An Introduction to Statistical Learning"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.21236\/ADA006655","author":"Kincaid","year":"1975","journal-title":"Derivation of New Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) For Navy Enlisted Personnel"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.460","article-title":"\u201cThe summary loop: learning to write abstractive summaries without examples,\u201d","author":"Laban","year":"2020","journal-title":"Proceedings of the Annual Meeting of the Association for Computational Linguistics"},{"key":"B13","first-page":"74","article-title":"\u201cROUGE: a package for automatic evaluation of summaries,\u201d","author":"Lin","year":"2004"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2212.09739","article-title":"LENS: a learnable evaluation metric for text simplification","author":"Maddela","year":"2022","journal-title":"arXiv"},{"key":"B15","first-page":"639","article-title":"SMOG grading-a new readability formula","volume":"12","author":"Mc Laughlin","year":"1969","journal-title":"J. 
Read"},{"article-title":"\u201cOn the stability of fine-tuning BERT: misconceptions, explanations, and strong baselines,\u201d","year":"2021","author":"Mosbach","key":"B16"},{"journal-title":"Global Autonomous Language Exploitation (GALE)","year":"2005","author":"Olive","key":"B17"},{"key":"B18","doi-asserted-by":"publisher","first-page":"311","DOI":"10.3115\/1073083.1073135","article-title":"\u201cBLEU: a method for automatic evaluation of machine translation,\u201d","author":"Papineni","year":"2002"},{"key":"B19","doi-asserted-by":"publisher","first-page":"8029","DOI":"10.18653\/v1\/2021.emnlp-main.633","article-title":"\u201cData-QuestEval: a referenceless metric for data-to-text semantic evaluation,\u201d","author":"Rebuffel","year":"2021"},{"key":"B20","doi-asserted-by":"publisher","first-page":"3982","DOI":"10.18653\/v1\/D19-1410","article-title":"\u201cSentence-BERT: sentence embeddings using siamese BERT-networks,\u201d","author":"Reimers","year":"2019"},{"key":"B21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-3-031-02166-4","article-title":"Automatic text simplification","volume":"10","author":"Saggion","year":"2017","journal-title":"Synth. Lect. Hum. Lang. Technol"},{"key":"B22","doi-asserted-by":"publisher","first-page":"6594","DOI":"10.18653\/v1\/2021.emnlp-main.529","article-title":"\u201cQuestEval: summarization asks for fact-based evaluation,\u201d","author":"Scialom","year":""},{"key":"B23","doi-asserted-by":"publisher","first-page":"3237","DOI":"10.18653\/v1\/D19-1320","article-title":"\u201cAnswers Unite! 
Unsupervised metrics for reinforced summarization models,\u201d","author":"Scialom","year":"2019"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2104.07560","article-title":"Rethinking automatic evaluation in sentence simplification","author":"Scialom","year":"","journal-title":"arXiv"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.704","article-title":"\u201cBLEURT: learning robust metrics for text generation,\u201d","author":"Sellam","year":"2020","journal-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics"},{"key":"B26","doi-asserted-by":"publisher","first-page":"738","DOI":"10.18653\/v1\/D18-1081","article-title":"\u201cBLEU is not suitable for the evaluation of text simplification,\u201d","author":"Sulem","year":""},{"key":"B27","doi-asserted-by":"publisher","first-page":"685","DOI":"10.18653\/v1\/N18-1063","article-title":"\u201cSemantic structural evaluation for text simplification,\u201d","author":"Sulem","year":""},{"key":"B28","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1017\/CBO9780511667268.008","article-title":"Comparing measures of lexical richness","volume":"93","author":"Van Hout","year":"2007","journal-title":"Model. Assess. 
Vocabulary Knowledge"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.eval4nlp-1.2","article-title":"\u201cFill in the BLANC: human-free quality estimation of document summaries,\u201d","author":"Vasilyev","year":"2020","journal-title":"Proceedings of the Evaluation and Comparison of NLP Systems Workshop"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1910.03771","article-title":"HuggingFace's transformers: state-of-the-art natural language processing","author":"Wolf","year":"2020","journal-title":"arXiv"},{"key":"B31","first-page":"1015","article-title":"\u201cSentence simplification by monolingual machine translation,\u201d","author":"Wubben","year":"2012","journal-title":"Proceedings of the Annual Meeting of the Association for Computational Linguistics"},{"key":"B32","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1162\/tacl_a_00139","article-title":"Problems in current text simplification research: new data can help","volume":"3","author":"Xu","year":"2015","journal-title":"Trans. Assoc. Comput. Linguist"},{"key":"B33","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1162\/tacl_a_00107","article-title":"Optimizing statistical machine translation for text simplification","volume":"4","author":"Xu","year":"2016","journal-title":"Trans. Assoc. Comput. 
Linguist"},{"key":"B34","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1002\/0470011815.b2a15150","article-title":"\u201cSpearman rank correlation,\u201d","author":"Zar","year":"2005","journal-title":"Encyclopedia of Biostatistics"},{"article-title":"\u201cBERTScore: evaluating text generation with BERT,\u201d","year":"2019","author":"Zhang","key":"B35"},{"key":"B36","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1053","article-title":"\u201cMoverScore: text generation evaluating with contextualized embeddings and earth mover distance,\u201d","author":"Zhao","year":"2019","journal-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2023.1223924\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T19:09:04Z","timestamp":1695409744000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2023.1223924\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,22]]},"references-count":36,"alternative-id":["10.3389\/frai.2023.1223924"],"URL":"https:\/\/doi.org\/10.3389\/frai.2023.1223924","relation":{},"ISSN":["2624-8212"],"issn-type":[{"type":"electronic","value":"2624-8212"}],"subject":[],"published":{"date-parts":[[2023,9,22]]}}}