Data Science Chair


    We have successfully acquired funding for a three-year DFG project in cooperation with the Chair for Computer Philology and History of Contemporary German Literature (Prof. Fotis Jannidis). The goal of the project is to improve the ability of language models to model characters in literary texts, an area where even current models often struggle to represent a text's content, that is, its story.

    We're looking forward to working on this project! If you're interested in joining us, you can always contact Prof. Dr. Andreas Hotho (dmir-jobs@uni-wuerzburg.de).

    Project Overview

    LitBERT is a collaboration between computer science and computational literary studies (CLS), an emerging field that analyses large collections of literary texts using a wide range of tools from computational linguistics, computer science, and its own tradition. In our project, we focus on the computational literary analysis of character, one of the most important descriptors of narrative and dramatic texts.
    Our project will investigate the textual description of characters' internal and external features, actions, and further character-specific information using knowledge-induced language models.
    We aim to create a character knowledge graph by extracting character information from text, identify different character types through data-driven clustering, and leverage this information to develop a character-attentive, "literary" language model ("LitBERT") for automatic literary analysis.
    The project will significantly advance the state of the art in combining language models and knowledge graphs, showing how to improve the performance of language models for the analysis of entities and their attributes by (a) integrating knowledge graphs and (b) enriching domain-specific knowledge graphs based on text analysis using language models. Additionally, we want to improve the handling of longer texts such as novels by advancing the ability of language models to represent knowledge, such as the portrayal and types of characters in the text world (i.e., the world described in the text).


    The following people are involved in the project: