Task #16: Evaluation of Wide Coverage Knowledge Resources

Mailing list

A mailing list is available for those interested in taking part in this task. Please subscribe using the form below if you would like to receive information about the task details and quick replies from the organizers.

You can browse the e-mail discussion on the task.
To join, enter your e-mail here:

Datasets and Formats

The organizers will provide a list of word-senses corresponding to WordNet 2.1.

Within a short period of time, the participants should return a set of Topic Signatures (lists of words and associated weights) corresponding to the list of word-senses provided by the organizers. The Topic Signatures should follow the format of the trial data-set.

Each participant should provide a zip or tar.gz file containing a set of files. Each file should contain the Topic Signature for a particular word-sense (that is, the words that best characterize the word-sense). The filename should follow the format word#pos#sense (for instance, party#n#1), indicating the word-sense in WordNet 2.1.
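As an illustration of the naming convention, the following minimal Python sketch splits a filename such as party#n#1 into its three parts and builds it back. The names WordSense, parse_sense_filename and sense_filename are hypothetical helpers introduced here; they are not part of any tool provided by the organizers.

```python
from typing import NamedTuple


class WordSense(NamedTuple):
    """A WordNet word-sense as encoded in a Topic Signature filename (hypothetical helper type)."""
    word: str
    pos: str
    sense: int


def parse_sense_filename(filename: str) -> WordSense:
    """Split a name such as 'party#n#1' into word, part of speech and sense number."""
    # Split from the right so that only the last two '#' separators are consumed.
    word, pos, sense = filename.rsplit("#", 2)
    return WordSense(word, pos, int(sense))


def sense_filename(sense: WordSense) -> str:
    """Build the filename for a word-sense, e.g. 'party#n#1'."""
    return f"{sense.word}#{sense.pos}#{sense.sense}"
```

A name that does not contain two '#' separators, or whose last field is not a number, raises ValueError, which makes malformed filenames easy to spot before packaging a submission.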

Topic Signature Format

Each file corresponding to a word sense will contain a weighted list of words. Each line will hold one word with its associated weight (which could represent the strength of the relation between the word and the word-sense).

Example for filename party#n#1:
democratic 0.0126629522055422
tammany 0.012422452683697
alinement 0.0122449890739299
federalist 0.0115942891714506
missionary 0.0103391657635677
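A file in this format could be read with a few lines of Python. This is a minimal sketch, assuming one word per line followed by an optional weight separated by whitespace; the function name parse_topic_signature is hypothetical. A word listed without a weight receives a weight of 1, matching the convention stated for the evaluation.

```python
from typing import Dict


def parse_topic_signature(text: str) -> Dict[str, float]:
    """Parse the contents of one Topic Signature file into a word -> weight mapping."""
    signature: Dict[str, float] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        # Assumption: the weight, when present, is the second whitespace-separated field.
        weight = float(parts[1]) if len(parts) > 1 else 1.0
        signature[word] = weight
    return signature
```

For example, parse_topic_signature(open("party#n#1").read()) would return a dictionary keyed by the words in that file.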


To measure the relative quality of the knowledge resources submitted for the task, we will perform an indirect evaluation using all the resources delivered as Topic Signatures (TS), that is, word vectors with weights associated with a particular WordNet synset. Words in a TS that have no weight will be given a weight of 1. This simple representation tries to be as neutral as possible with respect to the evaluation framework.

All knowledge resources submitted for the task will be indirectly evaluated on a common Word Sense Disambiguation task. In particular, we will use the English Lexical Sample framework of SemEval-07 (task 17, subtask 1), which covers a limited number of words with test examples annotated for their word senses. Performance will be measured on the test data using the scoring system provided by the Lexical Sample task organizers.

Furthermore, to be as neutral as possible with respect to the semantic resources submitted, we will systematically apply the same disambiguation method to all of them. Recall that our main goal is to establish a fair comparison of the knowledge resources rather than to provide the best disambiguation technique for a particular semantic knowledge base.

A simple word-overlap count (or weighted sum) will be computed between the Topic Signature and the test example. We will also consider multiword terms. Thus, the occurrence evaluation measure will count the number of overlapping words, and the weight evaluation measure will add up the weights of the overlapping words. For each test example, the synset with the highest overlap count (or weight) will be selected.
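The two measures can be sketched in Python as follows. This is only an illustration under stated assumptions, not the organizers' scoring code: the test example is taken to be already tokenized, each distinct overlapping word is counted once, multiword terms are not handled, and ties go to the first sense encountered. The names overlap_scores and choose_sense are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Signature = Dict[str, float]  # word -> weight


def overlap_scores(signature: Signature, context_words: Iterable[str]) -> Tuple[int, float]:
    """Return (occurrence measure, weight measure) for one Topic Signature and one test example."""
    # Assumption: each distinct word of the test example counts once.
    overlapping = {word for word in set(context_words) if word in signature}
    count = len(overlapping)
    weight = sum(signature[word] for word in overlapping)
    return count, weight


def choose_sense(signatures: Dict[str, Signature],
                 context_words: Iterable[str],
                 use_weights: bool = False) -> str:
    """Select the sense whose Topic Signature overlaps most with the test example."""
    words = list(context_words)
    index = 1 if use_weights else 0  # pick the weight or the occurrence measure
    # max() keeps the first sense in iteration order when scores are tied (an assumption).
    return max(signatures, key=lambda sense: overlap_scores(signatures[sense], words)[index])
```

Here signatures maps a sense identifier such as party#n#1 to its parsed Topic Signature; setting use_weights=True switches from the occurrence measure to the weight measure.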

Participants who have wide-coverage semantic resources linked to WordNet will receive a list of word senses. Within a very short time, they should submit the appropriate Topic Signatures for those word senses.

We will also establish a set of baselines using existing wide-coverage semantic resources such as WordNet, MCR, etc.

Download area

Trial Data

The trial data is available in the Topic Signatures format.



Test Data

The test data is available:



System and Results

This section will be completed after the competition.


Montse Cuadros and German Rigau. Quality Assessment of Large-scale Knowledge Resources. EMNLP'2006. Sydney, Australia.

Last updated:  February 1, 2007

 For more information, visit the SemEval-2007 home page.