Data Science for Digital Humanities 2
Lecturers: Prof. Dr. Goran Glavaš, Lennart Keller
Sessions: Thursdays 10:00 - 12:00 in the Sensalight building (John-Skilton-Str. 8a), 4th floor, room 4.23
Kickoff: 27.4.2023
Registration
It is not necessary to register for the sessions.
To get access to teaching materials and current announcements, you need to register for the WueCampus course.
To participate in the exam, you must register for it via WueStudy.
All further information will be shared in our WueCampus course.
Objective of the lecture
The course builds on Data Science for DH 1 (from the winter semester) and introduces complementary (and some more advanced) DS topics, with an emphasis on use cases and applications in the (computational/digital) humanities.
Tentative schedule/content of the course:
27.4. Session #1: Introduction
- Recap of DS4DH1
- Course organization
4.5. Session #2: Corpus linguistics
- Lexical association measures
- Multi-word expressions, collocations, idioms
- Lexico-semantic resources: WordNet, BabelNet, PanLex
11.5. Session #3: Topic modeling
- Latent Dirichlet Allocation
- Practical examples with LDA in Gensim
- Homework project #1: pick a corpus, induce topics, analyze topics and topical distribution of documents, prepare a small-scale presentation
25.5. Session #4: Student presentations -- Topic Modeling Homeworks
1.6. Session #5: Networks
- Introduction to Graph Theory
- Node importance -- degree centrality, closeness centrality, betweenness centrality
- Shortest paths
- Practical exercises with networkx
- Homework project #2: analysis of a large-scale network dataset; prepare a small-scale presentation with insights
15.6. Session #6: Student presentations -- Network Analysis
22.6. Session #7: Evaluation & Statistical Testing
- Common evaluation measures for classification and regression
- Gold-standard annotation and inter-annotator agreement
- Significance testing (parametric: Student’s t-test; non-parametric: Wilcoxon’s test)
29.6. Session #8: Deep Learning
- Convolutional NNs
- Recurrent NNs
- Attention mechanism and Transformers
- Practical exercises in Keras
6.7. Session #9: Interpretability & Fairness
- Explainability and interpretability of machine learning models
- Biases and fairness: data bias, model bias
13.7. Session #10: Guest Lecture
- A talk by a prominent researcher in the area of Computational Humanities