Abstract
In this paper we present Uniform Meaning Representation (UMR), a meaning representation designed to annotate the semantic content of a text. UMR is primarily based on Abstract Meaning Representation (AMR), an annotation framework initially designed for English, but also draws from other meaning representations. UMR extends AMR to other languages, particularly morphologically complex, low-resource languages. UMR also adds features to AMR that are critical to semantic interpretation and enhances AMR by proposing a companion document-level representation that captures linguistic phenomena such as coreference as well as temporal and modal dependencies that potentially go beyond sentence boundaries.


Notes
As can be seen from this example, the document-level representation is a list of triples of the form <dependent relation parent>; it thus deviates from the PENMAN notation used for the sentence-level representation.
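The triple layout described in this note can be illustrated with a minimal Python sketch. It is not taken from the UMR tooling: the `Triple` type, the `group_by_parent` helper, and all node identifiers and relation labels below are hypothetical and chosen only for illustration. The only thing assumed from the note is the <dependent relation parent> ordering.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One document-level edge, stored in <dependent relation parent> order."""
    dependent: str
    relation: str
    parent: str


def group_by_parent(triples):
    """Index triples by parent node so each node's dependents can be looked up."""
    index = defaultdict(list)
    for t in triples:
        index[t.parent].append((t.relation, t.dependent))
    return dict(index)


# Hypothetical node identifiers and relation labels, for illustration only.
doc_level = [
    Triple("s1e1", ":before", "document-creation-time"),
    Triple("s2e1", ":after", "s1e1"),
    Triple("s2x1", ":same-entity", "s1x1"),
]

by_parent = group_by_parent(doc_level)
```

Because the document level is a flat list of edges rather than a nested bracketed graph, a temporal, modal, or coreference link can connect nodes from different sentences without being embedded in either sentence's own structure.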
Additional information
This work is supported in part by a grant from the IIS Division of the National Science Foundation (Award Nos. 1763926, 1764048, 1764091) entitled “Building a Uniform Meaning Representation for Natural Language Processing”, awarded to Nianwen Xue, James Pustejovsky, Martha Palmer and William Croft. All views expressed in this paper are those of the authors and do not necessarily represent the views of the National Science Foundation. This work is also supported in part by a grant from the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2018101) and in part by the Czech Science Foundation (Award No. GX20-16819X).
Cite this article
Van Gysel, J.E.L., Vigus, M., Chun, J. et al. Designing a Uniform Meaning Representation for Natural Language Processing. Künstl Intell 35, 343–360 (2021). https://doi.org/10.1007/s13218-021-00722-w
DOI: https://doi.org/10.1007/s13218-021-00722-w