Task #16: Evaluation of Wide Coverage Knowledge Resources

Organized by:

Description of the task

Using large-scale semantic knowledge bases (such as WordNet) has become a common, often necessary, practice for most current Natural Language Processing systems. Even now, building knowledge bases that are large and rich enough for broad-coverage semantic processing takes a great deal of expensive manual effort, involving large research groups over long periods of development.

For instance, in more than eight years of manual construction (from version 1.5 to 2.0), WordNet grew from 103,445 to 204,074 semantic relations (symmetric relations are counted only once). That is, around twelve thousand new semantic relations per year.

However, in recent years the research community has devised a large set of innovative processes and tools for large-scale automatic acquisition of lexical knowledge from structured or unstructured corpora. For instance, the Multilingual Central Repository (MCR) contains 1.6 million semantic relations between synsets, most of them acquired by automatic means.

This task tries to establish the relative quality of the available semantic resources (derived by manual or automatic means) in a neutral environment. The quality of each large-scale knowledge resource will be evaluated indirectly on a Word Sense Disambiguation task. In particular, we will use the English Lexical Sample task as the evaluation benchmark (see task 17, subtask 1). Furthermore, to be as neutral as possible with respect to the knowledge bases studied, we will systematically apply the same disambiguation method to all the resources: the word sense knowledge contained in the resources provided by the participants will be used as topical knowledge.
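
The idea of applying one fixed disambiguation method to every resource can be illustrated with a minimal sketch. It assumes that each resource supplies, for every word sense, a set of topically related words, and that senses are scored by plain word overlap with the context. The task description does not specify the scoring, so the function `disambiguate`, the sense identifiers, and the toy data are all hypothetical.

```python
from collections import Counter


def disambiguate(context_words, sense_topics):
    """Pick the sense whose topic words overlap most with the context.

    context_words: tokens surrounding the target word (hypothetical input).
    sense_topics:  dict mapping a sense id to the set of topic words that a
                   knowledge resource associates with that sense.
    Returns the winning sense id and the score of every sense.
    """
    context = Counter(w.lower() for w in context_words)
    # Score = number of context tokens that appear among the sense's topic words
    scores = {
        sense: sum(context[w] for w in topics)
        for sense, topics in sense_topics.items()
    }
    # Iterating over sorted sense ids makes ties resolve deterministically
    best = max(sorted(scores), key=lambda s: scores[s])
    return best, scores


# Toy topical knowledge from one hypothetical resource
resource_a = {
    "bank%1": {"money", "loan", "deposit"},
    "bank%2": {"river", "water", "shore"},
}

context = "she sat on the shore of the river".split()
best_sense, sense_scores = disambiguate(context, resource_a)
print(best_sense, sense_scores)
```

Because the scoring function stays the same, running it with the sense-to-topic mapping of a different resource changes only the knowledge being tested, which is what keeps the comparison neutral.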

Furthermore, this task will study how these resources complement each other. That is, to what extent each knowledge base provides new knowledge not provided by the others.
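
One simple way to picture complementarity is to treat each knowledge base as a set of relation triples and measure the share of triples that no other knowledge base contains. This is only an illustrative sketch under that assumption: the task itself evaluates resources indirectly through Word Sense Disambiguation, and the function `novel_share`, the triple format, and the toy data below are hypothetical.

```python
def novel_share(resources):
    """For each resource, return the fraction of its relations found in no other.

    resources: dict mapping a resource name to a set of
               (source, relation, target) triples (assumed format).
    """
    result = {}
    for name, relations in resources.items():
        # Union of the relations held by every other resource
        others = set().union(
            *(rels for other, rels in resources.items() if other != name)
        )
        novel = relations - others
        result[name] = len(novel) / len(relations) if relations else 0.0
    return result


# Toy knowledge bases (hypothetical content)
kbs = {
    "kb_a": {
        ("dog", "hypernym", "canine"),
        ("cat", "hypernym", "feline"),
    },
    "kb_b": {
        ("dog", "hypernym", "canine"),
        ("dog", "related", "bark"),
        ("cat", "related", "purr"),
        ("bird", "related", "wing"),
    },
}

shares = novel_share(kbs)
print(shares)
```

A high share means the resource contributes knowledge the others lack, while a share near zero means it is largely subsumed by them.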


Full Details of Task
Last updated:  February 1, 2007