    Data Science Chair

    GermEval@Konvens'25 Participation - First Place in Two out of Three Harmful Content Detection Tasks!

    22.09.2025

    We participated in all GermEval tasks and placed first in two of the three Harmful Content Detection tasks!

    After our successes in the SemEval challenge, we took part in this year's GermEval by implementing its tasks in our SuperGLEBer benchmark. We entered all GermEval tasks (Flausch-Erkennung, Harmful Content Detection, LLMs4Subjects, SustainEval), placing first in two of the three Harmful Content Detection tasks and achieving strong results in SustainEval, despite the simplicity of our approach.

    Abstract (paper):

    We participate in the GermEval 2025 Shared Tasks by extending SuperGLEBer, a comprehensive benchmark for evaluating German language understanding, to the new tasks. Rather than focusing on optimizing task-specific performance, we adopt a complementary strategy: applying simple methods across 38 diverse (L)LMs (100M to 9B parameters) to analyze the tasks themselves, revealing that most models perform similarly on this year's tasks and on existing SuperGLEBer tasks. Notably, the regression-based verifiability rating task diverges from this trend, emerging as substantially more difficult and methodologically distinct. Through our comprehensive analysis, we find that three new tasks, including Flausch-Erkennung subtask 2, rank among the top 10 most discriminating tasks of the benchmark, effectively distinguishing between model capabilities. Most remarkably, we demonstrate that just 2-3 strategically selected tasks can approximate the complete benchmark rankings with 97-99% correlation, potentially enabling more efficient large-scale model evaluation while maintaining ranking accuracy. Overall, our submissions achieved competitive results, placing between 1st (out of one) and 6th across different tasks; e.g., for Flausch-Erkennung subtasks 1 and 2 we placed 3rd and 6th, respectively.
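The idea of approximating a full benchmark ranking from a handful of tasks can be illustrated with a small sketch. The abstract does not say which correlation measure or selection procedure the paper uses, so the following is only one plausible reading: it assumes Spearman rank correlation, ranks models by their mean score, and searches exhaustively over task subsets of size k. All function names and the toy scores are hypothetical and are not taken from SuperGLEBer.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks (highest value gets rank 1), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def best_task_subset(scores, k):
    """Find the k tasks whose mean-score ranking best matches the full ranking.

    scores maps a task name to a list of per-model scores (same model order).
    Returns (tuple_of_task_names, spearman_rho).
    """
    tasks = list(scores)
    n_models = len(scores[tasks[0]])
    full = [mean(scores[t][m] for t in tasks) for m in range(n_models)]
    best = None
    for subset in combinations(tasks, k):
        partial = [mean(scores[t][m] for t in subset) for m in range(n_models)]
        rho = spearman(full, partial)
        if best is None or rho > best[1]:
            best = (subset, rho)
    return best


# Made-up toy scores for four hypothetical models on four hypothetical tasks.
TOY_SCORES = {
    "task_a": [0.60, 0.70, 0.80, 0.90],
    "task_b": [0.55, 0.75, 0.70, 0.95],
    "task_c": [0.50, 0.50, 0.52, 0.51],
    "task_d": [0.40, 0.65, 0.85, 0.80],
}

subset, rho = best_task_subset(TOY_SCORES, 2)
print(subset, rho)
```

Exhaustive search is only practical for small k and a moderate number of tasks; with many tasks, a greedy or sampled search would be a natural substitute.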

     

    Additionally, we presented an overview of our contributions to the German NLP landscape.
