Consistent Annotation of WordNet using the Top Ontology
The Top Ontology (TO) (Alonge et al., 1998) is an independent hierarchy of features designed for clustering, comparing and exchanging concepts across languages in the EuroWordNet Project (Vossen, 1998). It has also commonly been used as a repository of lexical semantic information. Each WordNet synset has been annotated with one or more TO features.
The annotation of WordNet 1.6 with TO features (version 2.3) is available at the following link:
The TO is also integrated into the Multilingual Central Repository (MCR).
The TO consists of 63 features organized into three disjoint types of entities:
- 1stOrderEntity: physical things
- 2ndOrderEntity: events, states and properties
- 3rdOrderEntity: unobservable entities
Most of the subdivisions of the TO are disjoint categories: for example, a concept cannot be both Natural and Artifact. Nevertheless, such incompatible combinations can arise when TO features are inherited through the hyponymy hierarchy.
The inheritance of disjoint categories can be avoided by including blockage points in the hyponymy hierarchy paths. In this way, a consistent annotation of the nominal part of WordNet is obtained.
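The effect of a blockage point on inheritance can be illustrated with a minimal sketch. It is written in Python rather than Prolog, and it is not the code of the distributed tools. The toy hierarchy, the synset names, and the representation of a blockage point as a (synset, feature) pair are all assumptions made for illustration.

```python
# Toy hyponymy hierarchy: synset -> list of its hypernyms (hypothetical names).
HYPERNYMS = {
    "object": [],
    "natural_object": ["object"],
    "stone": ["natural_object"],
    "paperweight": ["stone"],
}

# Features annotated directly on a synset (hypothetical annotation).
ANNOTATED = {
    "natural_object": {"Natural"},
    "paperweight": {"Artifact"},
}

# Assumed representation of blockage points: (synset, feature) means the
# feature is not inherited by this synset, and therefore not passed on
# through it to its hyponyms.
BLOCKED = {("paperweight", "Natural")}


def expand(synset):
    """Return a synset's own features plus the unblocked inherited ones."""
    features = set(ANNOTATED.get(synset, set()))
    for parent in HYPERNYMS.get(synset, []):
        for feature in expand(parent):
            if (synset, feature) not in BLOCKED:
                features.add(feature)
    return features
```

In this toy example, "stone" inherits Natural from "natural_object", while the blockage point on "paperweight" keeps Natural from reaching a synset annotated as Artifact. Without the entry in BLOCKED, "paperweight" would end up with both disjoint features.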
WordNet to TO Annotation Tools
We have developed a set of tools for checking the consistency of the annotation and for obtaining its expansion. To establish consistency, we check that no incompatibility arises in the TO annotation of the nominal part of WordNet 1.6 when the blockage points are used. Once the annotation is consistent, its expansion can be obtained. These tools are implemented in Prolog and are available at the following links: [tar.gz] [zip]
The TO annotation and the addition of the blockage points yield some interesting figures. For instance, every blockage point subsumes an average of 120.16 synsets, and 28,123 synsets have at least one blockage point in their hypernymy path.
This package is distributed under the Creative Commons Attribution 3.0 Unported (CC BY 3.0) license, available at http://creativecommons.org/licenses/by/3.0.
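As a rough illustration of what such a consistency check involves, the following Python sketch reports every synset whose expanded annotation contains two disjoint features. It is not the Prolog implementation. The function names and the input format (a dictionary from synset name to feature set) are hypothetical, and only the Natural/Artifact pair is taken from the description above.

```python
from itertools import combinations

# Assumed disjointness declarations; only Natural/Artifact comes from the text.
DISJOINT = {frozenset({"Natural", "Artifact"})}


def incompatibilities(expanded):
    """List (synset, feature_a, feature_b) for each disjoint pair on one synset."""
    found = []
    for synset, features in sorted(expanded.items()):
        # Sorting makes the order of the reported pairs deterministic.
        for a, b in combinations(sorted(features), 2):
            if frozenset({a, b}) in DISJOINT:
                found.append((synset, a, b))
    return found


def is_consistent(expanded):
    """An annotation is consistent when no synset carries a disjoint pair."""
    return not incompatibilities(expanded)
```

Under these assumptions, an expanded annotation in which some synset carries both Natural and Artifact is reported as inconsistent, and the returned triples indicate where a blockage point may be needed.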
Álvez J., Atserias J., Carrera J., Climent S., Oliver A., Rigau G. Consistent Annotation of EuroWordNet with the Top Concept Ontology. Proceedings of the 4th Global WordNet Conference (GWC'08), Szeged, Hungary. January, 2008.
Álvez J., Atserias J., Carrera J., Climent S., Laparra E., Oliver A., Rigau G. Complete and Consistent Annotation of WordNet using the Top Concept Ontology. Proceedings of the 6th Language Resources and Evaluation Conference (LREC'08), Marrakech, Morocco. 2008.
Alonge A., Bertagna F., Bloksma L., Climent S., Peters W., Rodríguez H., Roventini A. and Vossen P. (1998) The Top-Down Strategy for Building EuroWordNet: Vocabulary Coverage, Base Concepts and Top Ontology. In Piek Vossen (ed.) EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Kluwer Academic Publishers, Dordrecht.
Vossen P. (ed.) (1998) EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Kluwer Academic Publishers, Dordrecht.