Seminar: Text Modelling

Overview

Text modelling is the basis for many applications in natural language processing and information retrieval. This seminar provides an introduction to a handful of basic methods for text modelling.

This seminar will be held in English.

For more information, please register in the Learnweb course once it is available.

Requirements

Research a given topic with an academic publication as a starting point
- Depending on the number of participants, alone or in teams
- Literature search for related work or further developments
Present two talks, each around 30 minutes (plus discussion)
- First talk on basics and first publication
- Second talk on a further development based on one or two papers (from the literature search)
Compile an individual written report
- Around 8 pages in the IJCAI format (double column), excluding references
- Description of the concepts from the talks
- Including results of literature search
Attend all presentations and participate in the discussions

Topics

Eigenvalue-based representation:
Drikvandi & Lawal (2020): Sparse Principal Component Analysis for Natural Language Processing
Probabilistic modelling:
Blei et al. (2003): Latent Dirichlet Allocation
Word embeddings:
Mikolov et al. (2013): Distributed Representations of Words and Phrases and Their Compositionality
Language model:
Devlin et al. (2019): BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

The corresponding source articles can be found by searching for the titles and authors mentioned above. You may need to be in the WWU network to access them.