The Bib100 Dataset

Overview

The Bib100 Evaluation Dataset contains 100 pairs of English words along with human-assigned relatedness judgments. It can be used to train and test semantic relatedness measures.
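A common way to test a relatedness measure against human judgments is to compute the Spearman rank correlation between the measure's scores and the human scores for the same pairs. The following is a minimal sketch of that idea, not a prescribed evaluation protocol for Bib100. The function names and the example values are hypothetical and are not taken from the dataset.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up scores for four word pairs (assumption: NOT real Bib100 values).
human_scores = [8.5, 1.2, 6.0, 3.3]        # human judgments, 0 to 10
measure_scores = [0.91, 0.10, 0.55, 0.40]  # output of some relatedness measure
rho = spearman(human_scores, measure_scores)
```

Because only the ranking matters, the measure's scores do not need to be on the same 0 to 10 scale as the human judgments.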

Description

The 100 pairs comprise 122 distinct English words, taken from the top 3000 tags of the social tagging system BibSonomy.

The relatedness scores were collected from 26 test subjects. Each subject was shown all 100 word pairs and rated the relatedness of each pair on a scale from 0 (unrelated) to 10 (synonymous).
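If individual judgments are available, one simple way to obtain a single score per pair is to average the ratings of all subjects. The sketch below assumes a hypothetical in-memory layout (a dictionary from word pair to a list of ratings). The actual file format of the download is not described here, and the file may already contain aggregated scores. The word pairs and ratings shown are made up for illustration, with three ratings per pair instead of 26.

```python
from statistics import mean

# Hypothetical layout: (word1, word2) -> individual ratings on the 0-10 scale.
# These pairs and ratings are invented, NOT taken from Bib100.
ratings = {
    ("web", "internet"): [9, 8, 10],
    ("python", "recipe"): [1, 0, 2],
}


def mean_relatedness(ratings_by_pair, low=0, high=10):
    """Average the per-subject ratings of each pair, checking the scale bounds."""
    result = {}
    for pair, scores in ratings_by_pair.items():
        if any(s < low or s > high for s in scores):
            raise ValueError(f"rating outside {low}-{high} for pair {pair}")
        result[pair] = mean(scores)
    return result


gold = mean_relatedness(ratings)
```

The mean is only one possible aggregate; the median is a reasonable alternative when a few subjects give outlying ratings.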

All scores were collected from native English speakers, using the crowdsourcing platform MicroWorkers.

Download

The data are available at: Bib100 dataset (4.3 kB)

For any questions, contact Thomas Niebler.
