ICE Scotland

In this project, we are compiling a 1-million-word corpus of spoken and written 21st-century Scottish English. The corpus contains the text categories and annotations specified by the ICE project, plus a large number of additional linguistic annotations such as part-of-speech tags and phonetic transcriptions. The corpus is available in XML format and can be downloaded here. The sound files are also available for download.

Corpus annotation is carried out with Pacx (Platform for Annotated Corpora). The corpus creation process is agile: it is query-driven, based on a cyclic processing model, and follows the minimal effort principle (see Voormann & Gut 2008).

Project members:

Robert Fuchs
Ulrike Gut
Elvira Hadzic
Ole Schützler
Jennifer Smith
Laura Sollgan
Silke Stagg
Holger Voormann
Sarah-Loana Weiß
Daniel Zerner