The Data Science Chair (Institute of Computer Science, JMU) is a highly interdisciplinary group of researchers led by Prof. Andreas Hotho and a member of CAIDAS. We are active in data science and machine learning research, with a focus on deep learning and applications in multiple domains.
Address
University of Würzburg, Data Science Chair (Computer Science X), Emil-Fischer-Straße 50, 97074 Würzburg
ModeConv: A Novel Convolution for Distinguishing Anomalous and Normal Structural Behavior. M. Schaller; D. Schlör; A. Hotho in ACM Trans. Sen. Netw. (2026).
New Encoders for German Trained from Scratch: Comparing ModernGBERT with Converted LLM2Vec Models. J. Wunderle; A. Ehrmanntraut; J. Pfister; F. Jannidis; A. Hotho (2026). 10424–10446.
Towards Knowledge Graph-Grounded Evaluation of Agentic LLMs on Cybersecurity Capture-the-Flag Challenges. D. Schlör; M. Bohn; M. Wolf; K. Bergner; C. Goldschmied; A. Hotho in 15th edition of the Language Resources and Evaluation Conference, KG & LLM @ LREC (2026).
Parameter Efficient Continual Automated Knowledge Graph Completion. J. Omeliyanenko; A. Hotho; D. Schlör in ESWC 2026 - 23rd European Semantic Web Conference (2026).
Rethinking Synthetic Oversampling for Intrusion Detection: When Similarity Hurts Performance. M. Wolf; D. Landes; A. Hotho; D. Schlör in CISIS 2026 - 19th International Conference on Computational Intelligence in Security for Information Systems (2026).
TaylorNet: Learning PDEs from Non-Grid Data. A. Dulny; P. Heinisch; A. Hotho; A. Krause in Proceedings of Machine Learning Research, C. Coelho, B. Zimmering, M. F. P. Costa, L. L. Ferrás, O. Niggemann (Eds.), (2025). (Vol. 277) 26–46.
Die SuperGLEBer at GermEval 2025 Shared Tasks: Growing Pains - When More Isn’t Always Better. J. Wunderle; J. Pfister; A. Hotho. C. Wartena, U. Heid (Eds.), (2025). 479–493.
BARTABSA++: Revisiting BARTABSA with Decoder LLMs. J. Pfister; T. Völker; A. Vlasjuk; A. Hotho. H. Fei, K. Tu, Y. Zhang, X. Hu, W. Han, Z. Jia, Z. Zheng, Y. Cao, M. Zhang, W. Lu, N. Siddharth, L. Ovrelid, N. Xue, Y. Zhang (Eds.), (2025). 115–128.
SALT at SemEval-2025 Task 2: A SQL-based Approach for LLM-Free Entity-Aware-Translation. T. Völker; J. Pfister; A. Hotho. S. Rosenthal, A. Rosá, D. Ghosh, M. Zampieri (Eds.), (2025). 852–864.
LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch. J. Pfister; J. Wunderle; A. Hotho. W. Che, J. Nabende, E. Shutova, M. T. Pilehvar (Eds.), (2025). 2227–2246.
CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval. D. Benchert; S. Meßlinger; S. Goller; J. Kaiser; J. Pfister; A. Hotho. S. Rosenthal, A. Rosá, D. Ghosh, M. Zampieri (Eds.), (2025). 1623–1638.
Modeling and Analyzing the Influence of Non-Item Pages on Sequential Next-Item Prediction. E. Fischer; A. Zehe; A. Hotho; D. Schlör in ACM Trans. Recomm. Syst. (2025).
Enriching Large Language Models with Knowledge Graphs for Computational Literary Studies. T. Hagen; J. Omeliyanenko; A. Ehrmanntraut; A. Hotho; A. Zehe; F. Jannidis in Handbook on Neurosymbolic AI and Knowledge Graphs (2025).
PreAdapter: Pre-training Language Models on Knowledge Graphs. J. Omeliyanenko; A. Hotho; D. Schlör. G. Demartini, K. Hose, M. Acosta, M. Palmonari, G. Cheng, H. Skaf-Molli, N. Ferranti, D. Hernández, A. Hogan (Eds.), (2025). 210–226.
How do you build scalable, transparent language models for the German language entirely from scratch? We had the opportunity to discuss exactly that at the DLR in Ulm. The focus was on our model families LLäMmlein (120M–7B) and ModernGBERT (138M–1B), as well as the unique challenges of purely German tokenization.
From Brisbane with Insights: How "tame" is our German LLM family really? Our chair has returned from a six-week research sabbatical at The University of Queensland with new findings on AI safety and "LLäMmlein".