Clausal Coordinate Ellipsis (CCE) is a challenging linguistic phenomenon in which, in a coordinated sentence, at least one constituent in the second conjunct and/or one word in the first conjunct can be omitted. For the list of phenomena covered by CCE in German, see slides 4 and 5 in Memmesheimer and Harbusch (2023); slide 3 illustrates the case in which different CCE phenomena occur at the same time.
We work on different aspects of CCE, presented here in their historical order. All considerations in the following are based on the psycholinguistic perspective on CCE phenomena described in Kempen (2009):
- Automatic CCE generation based on syntactic rules (cf. the system ELLEIPO, written in Java, which originally worked for Dutch and German; it currently works for Dutch, Estonian, German, Hungarian, Polish, and Russian). ELLEIPO can be used in different environments:
- Component in a natural-language generator based on the stages "Conceptualization", "Aggregation", and "Formulation" (cf. Harbusch and Kempen (2009)).
- Interactive trainer to teach ellipsis generation to learners of the target languages Dutch, Estonian, German, Hungarian, and Russian (cf. Harbusch, Krusko, and Kempen (2016); watch a version of the system at work; cf. the presentation at EAB 2016).
- Building the test corpus for interviews with native speakers, in which they rate whether or not the CCE-generation rules work in a new target language. For this purpose, translate the test sentences from one of the existing target languages into the new one, then run ELLEIPO in batch mode to produce all combinations of CCE phenomena in the new target language (cf. Harbusch and Memmesheimer (2018) and Section 5 in Harbusch et al. (2018)).
- Treebank studies evaluating the accuracy of our syntactic approach to CCE generation in German and Dutch:
- Clausal coordinate ellipsis in German: The TIGER treebank as a source of evidence.
Karin Harbusch and Gerard Kempen.
Proceedings of NODALIDA 2007 - Sixteenth Nordic Conference of Computational Linguistics, Tartu, Estonia, 2007.
- Clausal Coordinate Ellipsis and its Varieties in Spoken German: A Study with the TüBa-D/S Treebank of the VERBMOBIL Corpus.
Karin Harbusch and Gerard Kempen.
Proceedings of TLT8 - 8th International Workshop on Treebanks and Linguistic Theories, Milano, Italy, 2009.
- Incremental sentence production inhibits clausal coordinate ellipsis: A treebank study into Dutch and German.
Karin Harbusch.
David Schlangen and Hannes Rieser (eds.). Dialogue and Discourse, Vol. 2, No. 1, 2011, pp. 313-332.
- Adapting the syntax-based CCE-generation rules to new target languages:
- Parsing based on an inversion of the CCE-generation rules:
- OPIELLE reverses the procedures of ELLEIPO; it expands the chart produced by a probabilistic context-free parser; for any input sentence (coordinated or not), it produces the so-called canonical form (i.e., it reconstructs the elided elements, if there are any), and
- a German Parallel Clausal Coordinate Ellipsis Corpus developed for the evaluation of OPIELLE. The corpus is available upon request from Karin Harbusch.
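The relation between generation and parsing described above can be illustrated with a deliberately small sketch in Python. It is not ELLEIPO's or OPIELLE's implementation: the real systems work on syntactic structures and a parser chart, whereas this toy assumes a flat, hypothetical representation in which each conjunct is a list of (role, words) pairs with the same role order in both conjuncts. It covers only one CCE phenomenon, Gapping (the verb of the second conjunct is omitted when it is identical to the verb of the first), and all function names are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical flat representation (an assumption of this sketch):
# a conjunct is a list of (role, words) pairs.
Conjunct = List[Tuple[str, str]]


def elide_gapping(first: Conjunct, second: Conjunct) -> Conjunct:
    """Generation direction: drop the verb of the second conjunct
    if it is identical to a verb of the first conjunct."""
    first_verbs = {words for role, words in first if role == "verb"}
    return [(role, words) for role, words in second
            if not (role == "verb" and words in first_verbs)]


def reconstruct(first: Conjunct, reduced: Conjunct) -> Conjunct:
    """Parsing direction: build the canonical form by refilling every role
    that is missing in the reduced conjunct from the first conjunct."""
    present = dict(reduced)
    return [(role, present.get(role, words)) for role, words in first]


def surface(conjunct: Conjunct) -> str:
    """Join the words of a conjunct into a surface string."""
    return " ".join(words for _, words in conjunct)


first = [("subject", "Hans"), ("verb", "isst"), ("object", "Äpfel")]
second = [("subject", "Peter"), ("verb", "isst"), ("object", "Birnen")]

reduced = elide_gapping(first, second)
print(surface(first), "und", surface(reduced))
# prints: Hans isst Äpfel und Peter Birnen
print(surface(first), "und", surface(reconstruct(first, reduced)))
# prints: Hans isst Äpfel und Peter isst Birnen
```

As in the description of the canonical form, a second conjunct without any elision passes through `reconstruct` unchanged, so the same function can be applied to elliptical and non-elliptical input alike.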