ICE Scotland
In this project, we compile a 1-million-word corpus of spoken and written 21st-century Scottish English. The corpus contains the text categories and annotations specified by the ICE project, plus a large number of additional linguistic annotations such as part-of-speech tags and phonetic transcriptions. The corpus is available in XML format and can be downloaded here. The sound files are also available for download.
Corpus annotation is carried out with Pacx (Platform for Annotated Corpora). The corpus creation process is agile: it is query-driven, based on a cyclic processing model, and follows the minimal effort principle (see Voormann & Gut 2008).
Project members:
- Robert Fuchs
- Ulrike Gut
- Elvira Hadzic
- Ole Schützler
- Jennifer Smith
- Laura Sollgan
- Silke Stagg
- Holger Voormann
- Sarah-Loana Weiß
- Daniel Zerner