Data Science for Digital Humanities 2

Lecturers: Prof. Dr. Goran Glavaš, Lennart Keller

Sessions: Thursdays 10:00 - 12:00 in Sensalight building (John-Skilton-Str. 8a), 4th floor, room 4.23

Kickoff: 27.4.2023

Registration

It is not necessary to register for the sessions.
To get access to the teaching materials and current announcements, you need to register for the WueCampus course.

To participate in the exam you must register for the exam via WueStudy.
All further information will be shared in our WueCampus course.

Objective of the lecture

The course builds on Data Science for DH 1 (taught in the winter semester) and introduces complementary, and in some cases more advanced, data science topics, with an emphasis on use cases and applications in the (computational/digital) humanities.

Tentative schedule/content of the course:

27.4. Session #1: Introduction

Recap of DS4DH1
Course organization

4.5. Session #2: Corpus linguistics

Lexical association measures
Multi-word expressions, collocations, idioms
Lexico-semantic resources: WordNet, BabelNet, PanLex
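
As a rough illustration of what a lexical association measure computes, the sketch below scores adjacent word pairs with pointwise mutual information (PMI). It is not course material: the function name `pmi_scores`, the toy sentence, and the choice of PMI over other measures are assumptions made for this example only.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Return {(w1, w2): PMI} for adjacent word pairs in a token list.

    Hypothetical helper for illustration; probabilities are plain
    relative frequencies with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs that are too rare to trust
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (an assumption for the example, not real data).
tokens = "new york is big and new york is old".split()
scores = pmi_scores(tokens)
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 2))
```

On corpora this small, PMI tends to favor pairs of rare words, which is why a frequency threshold such as `min_count` is commonly applied in practice.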

11.5. Session #3: Topic modeling

Latent Dirichlet Allocation
Practical examples with LDA in Gensim
Homework project #1: pick a corpus, induce topics, analyze topics and topical distribution of documents, prepare a small-scale presentation

25.5. Session #4: Student presentations -- Topic Modeling Homeworks

1.6. Session #5: Networks

Introduction to Graph Theory
Node importance -- degree centrality, closeness centrality, betweenness centrality
Shortest paths
Practical exercises with networkx
Homework project #2: analysis of a large-scale network dataset; prepare a small-scale presentation with insights
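
To make the notion of node importance concrete, here is a minimal pure-Python sketch of degree centrality on an undirected graph. The function name and the toy edge list are assumptions for illustration; the normalization by n - 1 follows the same convention as `networkx.degree_centrality`, and the sketch assumes a simple graph without self-loops.

```python
def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list.

    Hypothetical helper for illustration; assumes a simple graph
    (no self-loops, duplicate edges counted once).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # A node connected to every other node gets centrality 1.0.
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Toy star-shaped network (an assumption for the example).
edges = [("hub", "a"), ("hub", "b"), ("hub", "c")]
centrality = degree_centrality(edges)
print(centrality)
```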

15.6. Session #6: Student presentations -- Network Analysis

22.6. Session #7: Evaluation & Statistical Testing

Common evaluation measures for classification and regression
Gold-standard annotation and inter-annotator agreement
Significance testing (parametric: Student’s t-test; non-parametric: Wilcoxon’s test)
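
As one possible illustration of inter-annotator agreement, the sketch below computes Cohen's kappa for two annotators who labeled the same items. The function name and the toy labels are assumptions for this example, and other agreement coefficients may be the ones discussed in the session.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences.

    Hypothetical helper for illustration only.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Toy annotations (an assumption for the example, not real data).
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, so it can be much lower than the plain percentage of matching labels.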

29.6. Session #8: Deep Learning

Convolutional NNs
Recurrent NNs
Attention mechanism and Transformers
Practical exercises in Keras

6.7. Session #9: Interpretability & Fairness

Explainability and interpretability of machine learning models
Biases and fairness: data bias, model bias

13.7. Session #10: Guest Lecture

A talk by a prominent researcher in the area of Computational Humanities

