Ground Control to Major LLäMmlein: Our German LLMs Visit the DLR
26.03.2026

How do you build scalable, transparent language models for the German language entirely from scratch? We had the opportunity to discuss exactly that at the DLR in Ulm. The focus was on our model families LLäMmlein (120M–7B) and ModernGBERT (138M–1B), as well as the unique challenges of purely German tokenization.
On March 19, we delivered a keynote titled "LLäMmlein and ModernGBERT: A New German LLM Family in Research and Application" at the knowledge exchange workshop (Wissensaustauschworkshop) at the DLR in Ulm.
We had a great time and want to send a huge thank you to the organizers for having us! The insightful sessions and tutorials, and the opportunity to connect and exchange ideas with so many fantastic people, made it a memorable event.
For those who couldn't be there, here is a quick overview of what we presented:
While most large language models focus on English, we introduce two complementary, German-first model families: LLäMmlein, a decoder-only family (120M–7B parameters), and ModernGBERT, an encoder-only family (138M–1B parameters). Built for scalability and transparency, both are trained entirely from scratch on a German-only corpus using a custom German tokenizer. We describe our motivation for building fully German models, how we constructed the dataset and training pipeline, and the challenges we overcame in tokenization, corpus curation, and stable scaling across different model sizes and cluster configurations. For evaluation, we introduce SuperGLEBer, our German-specific benchmark, and report results across tasks to analyze how quality scales with model size and architecture. We release the data curation process, tokenizer, training recipes, checkpoints, and evaluation suite to support reproducibility and advance German-language AI.
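For readers who want to experiment with the released checkpoints, here is a minimal sketch of loading one with the Hugging Face transformers library. The repository ID shown is a placeholder, not the actual release name; substitute the ID of the published checkpoint you want to use.

```python
# Minimal sketch: loading a released LLäMmlein checkpoint via Hugging Face
# transformers. The repository ID below is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "example-org/LLaMmlein-1B"  # placeholder; use the published repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the tokenizer was trained on German-only data, German text should
# segment into comparatively few tokens.
inputs = tokenizer(
    "Wie baut man deutsche Sprachmodelle von Grund auf?",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```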
