Intern
    Data Science Chair

    XAI Benchmarking for Cyber Security

    06.05.2025

    Objective:

    Investigate how to derive explanation-relevant features from simulated SIEM logs by analyzing known attack scenarios. Use publicly available annotated attack logs (e.g., from securitydatasets.com) to identify explanation-relevant patterns using data science methods. The goal is to build a feature annotation pipeline that supports future XAI evaluation in cyber security.

    Supervisor: Daniel Schlör

    Key Tasks:

    • Select a small set of attack scenarios from securitydatasets.com (e.g., defense evasion, lateral movement) or other sources
    • Load and preprocess JSON/EVTX/Sysmon logs
    • Analyze the logs to extract statistically or semantically relevant features (e.g., rare parent-child processes, registry writes, command line tokens)
    • Use interpretable ML models (e.g., decision trees, rule learning) to derive human-understandable decision criteria
    • Parse detection rules (e.g., from Sigma) to automatically extract the key features referenced by SOC analysts and integrate them into the feature extraction pipeline
    • Create an (active learning) annotation pipeline
    • Annotate logs with candidate explanation features or tags
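
    As a rough illustration of the feature extraction and annotation tasks above, the following minimal sketch flags rare parent-child process pairs in process-creation events and tags the matching events. It is not a prescribed design: the function names, the 5% rarity threshold, the `explanation_tags` key, and the sample events are all hypothetical, and the field names `EventID`, `ParentImage`, and `Image` are assumed to follow the Sysmon process-creation schema as exported to JSON.

```python
import json
from collections import Counter


def load_events(lines):
    """Parse JSON-lines text into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def rare_parent_child_pairs(events, max_share=0.05):
    """Return (parent, child) pairs whose share of all process-creation
    events (EventID 1) is at most max_share. The threshold is an assumption."""
    pairs = [
        (e["ParentImage"], e["Image"])
        for e in events
        if e.get("EventID") == 1 and "ParentImage" in e and "Image" in e
    ]
    counts = Counter(pairs)
    total = len(pairs)
    return {pair for pair, n in counts.items() if n / total <= max_share}


def annotate(events, rare_pairs):
    """Copy each event and attach a list of candidate explanation tags."""
    annotated = []
    for e in events:
        tags = []
        if (e.get("ParentImage"), e.get("Image")) in rare_pairs:
            tags.append("rare_parent_child")
        annotated.append({**e, "explanation_tags": tags})
    return annotated


# Invented sample data: 30 common events and one unusual one.
sample_lines = [
    json.dumps({"EventID": 1, "ParentImage": "explorer.exe", "Image": "chrome.exe"})
] * 30 + [
    json.dumps({"EventID": 1, "ParentImage": "winword.exe", "Image": "powershell.exe"})
]

events = load_events(sample_lines)
rare = rare_parent_child_pairs(events)
annotated = annotate(events, rare)
print(rare)
```

    A frequency threshold is only one possible notion of "rare"; in practice the candidate tags would likely be reviewed by an analyst or fed into the active learning loop rather than treated as ground truth.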

    Extension Directions (Master Thesis / Practica):

    • Semi-Automated Labeling Framework Using External Datasets
    • Explainability Evaluation Benchmarks Using Public Logs
    • Multi-Modal Knowledge Graph Construction

     
