Abstract
We present a method for improving local coherence in German that benefits both automatically generated and human-written texts. We demonstrate that local coherence crucially depends on which constituent occupies the initial position in a sentence (the Vorfeld). To support our hypothesis, we provide statistical evidence from a corpus investigation and from an experiment with human judges. Additionally, we implement our findings in a generation module that determines the Vorfeld constituent automatically.
Cite this article
Filippova, K., Strube, M. The German Vorfeld and Local Coherence. J of Log Lang and Inf 16, 465–485 (2007). https://doi.org/10.1007/s10849-007-9044-3
DOI: https://doi.org/10.1007/s10849-007-9044-3