    Data Science Chair

    Our paper "CapsKG: Enabling Continual Knowledge Integration in Language Models for Automatic Knowledge Graph Completion" has been accepted at ISWC 2023

    07/17/2023

    We show that architectures from the machine learning subfield of continual learning can help language models incrementally learn new facts about the world.

    We propose a neural network architecture that extends existing language models such as BERT and helps them continually learn new factual knowledge. We will present our work at the International Semantic Web Conference 2023 (ISWC).

    Abstract

    Automated completion of knowledge graphs is a popular topic in the Semantic Web community that aims to automatically and continuously integrate newly appearing knowledge into knowledge graphs using artificial intelligence. Recently, approaches that leverage implicit knowledge from language models for this task have shown promising results. However, when language models are fine-tuned directly to the domain of knowledge graphs, they forget their original language representation and the knowledge associated with it. An existing solution to this issue is a trainable adapter, which is integrated into a frozen language model to extract the relevant knowledge without altering the model itself. However, this constrains generalizability to the specific extraction task and by design requires a new, independent adapter to be trained for each new knowledge extraction task. This effectively prevents the model from benefiting from the knowledge incorporated in previously trained adapters.
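    To make the adapter idea concrete, below is a minimal sketch of a bottleneck adapter attached to a frozen BERT backbone. It is an illustration under assumed defaults (hidden size 768, reduction factor 16, GELU activation), not the exact architecture used in the paper.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class Adapter(nn.Module):
    """Trainable bottleneck module added to a frozen language model."""
    def __init__(self, hidden_size: int = 768, reduction: int = 16):
        super().__init__()
        bottleneck = hidden_size // reduction
        self.down = nn.Linear(hidden_size, bottleneck)  # project down
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)    # project back up

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the frozen model's representation passes
        # through unchanged; the adapter learns only a small correction.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Freeze the backbone so only adapter parameters receive gradients.
backbone = BertModel.from_pretrained("bert-base-uncased")
for p in backbone.parameters():
    p.requires_grad = False

adapter = Adapter()
hidden = backbone(torch.tensor([[101, 2023, 102]])).last_hidden_state
out = adapter(hidden)  # in practice, adapters sit inside each transformer layer
```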

    In this paper, we propose to combine the benefits of adapters for knowledge graph completion with the idea of integrating capsules, introduced in the field of continual learning. This allows the continuous integration of knowledge into a joint model by sharing and reusing previously trained capsules. We find that our approach outperforms solutions using traditional adapters, while requiring notably fewer parameters for continuous knowledge integration. Moreover, we show that this architecture benefits significantly from knowledge sharing in low-resource situations, outperforming adapter-based models on the task of link prediction. 
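    The capsule mechanism can be pictured as a shared pool of small adapter-like modules with a lightweight per-task routing over them. The sketch below is a deliberately simplified reading of that idea: the class and parameter names (CapsulePool, routing, n_capsules) and the soft-routing scheme are assumptions for illustration, not the exact CapsKG formulation.

```python
import torch
import torch.nn as nn

class CapsulePool(nn.Module):
    """A shared pool of small 'capsule' modules with per-task soft routing,
    so new tasks can reuse what earlier tasks have already learned."""
    def __init__(self, hidden_size: int = 768, n_capsules: int = 4,
                 n_tasks: int = 3, reduction: int = 16):
        super().__init__()
        bottleneck = hidden_size // reduction
        self.capsules = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, bottleneck),
                nn.GELU(),
                nn.Linear(bottleneck, hidden_size),
            )
            for _ in range(n_capsules)
        ])
        # One learnable routing vector per task decides how strongly each
        # shared capsule contributes to that task.
        self.routing = nn.Parameter(torch.zeros(n_tasks, n_capsules))

    def forward(self, hidden_states: torch.Tensor, task_id: int) -> torch.Tensor:
        weights = torch.softmax(self.routing[task_id], dim=-1)
        mixed = sum(w * caps(hidden_states)
                    for w, caps in zip(weights, self.capsules))
        return hidden_states + mixed  # residual, as with a plain adapter

pool = CapsulePool()
h = torch.randn(1, 5, 768)      # (batch, sequence, hidden) activations
out_task0 = pool(h, task_id=0)  # different tasks draw on the same capsule pool
out_task2 = pool(h, task_id=2)
```

    Under this reading, integrating a new extraction task adds only a small routing vector over the shared capsules rather than a full independent adapter, which matches the parameter savings and knowledge sharing described above.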
