ICE Scotland is a 1-million-word corpus of spoken and written 21st-century Scottish English. It contains the text categories and annotations specified by the International Corpus of English (ICE) project, plus a large number of additional linguistic annotations such as part-of-speech tagging and phonemic transcriptions. The corpus is available in XML format.
The first parts of the corpus were released in April 2020 and can be downloaded here.
We compiled the corpus using Pacx (Platform for Annotated Corpora).