AG 4 (Working Group 4) – Encoding language and linguistic information in historical corpora

Kerstin Eckart, Universität Stuttgart
Carolin Odebrecht, Humboldt-Universität zu Berlin

Historical corpora have become an established empirical, digital basis for many kinds of linguistic studies. Annotations on these corpora have to strike a balance between a diplomatic representation of the historical text and its linguistic analysis. This requires linguistic modelling of the annotations in order to develop annotation guidelines and concepts, as well as assignment methods and corpus architectures. The working group therefore focuses on established and new approaches that address these requirements for a structured exploration of historical corpus data.


Thursday, 9 March 2017

11:15 – 12:15 Mathilde Hennig
Basic categories in multi layered grammatical annotation
12:15 – 12:45 Svetlana Petrova
Particle verb constructions in historical German and what corpus studies reveal about them
13:45 – 14:15 Lisa Dücker, Stefan Hartmann & Renata Szczepaniak
Annotating a multiregional diachronic corpus of Early New High German handwritten texts

Friday, 10 March 2017

11:30 – 12:00 Maarten Janssen
TEITOK: Combining language and linguistic information without compromise
12:00 – 12:30 Zarah Weiß & Gohar Schnelle
Annotation of an Early New High German Corpus: The LangBank Pipeline
12:30 – 13:00 Cătălina Mărănduc, Cenel-Augusto Perez, Ludmila Malahov & Alexandru Colesnicov
A diachronic corpus for Romanian (RoDia)
13:00 – 13:30 Katrin Goldschmidt
Development and annotation of a newspaper corpus as part of a doctoral thesis on text structure and cohesion in news items from the 17th and 18th centuries
13:30 – 14:00 Nicoletta Puddu
Encoding sociolinguistic variables in a corpus of Medieval Sardinian texts