Multilingual Central Repository

The current version of the Multilingual Central Repository (MCR) (Atserias et al. 04, Gonzalez-Agirre et al. 12) is a result of the 5th Framework MEANING project (IST-2001-34460), the Spanish government projects KNOW (TIN2006-15049-C03) and KNOW2 (TIN2009-14715-C04-01), and the ongoing SKaTer project (TIN2012-38584-C06).

The MCR integrates, within the same EuroWordNet framework, wordnets for five different languages: English, Spanish, Catalan, Basque and Galician. The Inter-Lingual-Index (ILI) connects words in one language to their equivalent translations in any of the other languages, thanks to automatically generated mappings among WordNet versions. The current ILI version corresponds to WordNet 3.0. Furthermore, the MCR is enriched with semantically tagged glosses.
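As a rough illustration of how a shared ILI lets a word in one language reach its equivalents in another, the following minimal Python sketch indexes a handful of hand-written records by ILI identifier. The records, the identifier format and the function names (build_indexes, translate) are assumptions made for this example; they are not extracted from the MCR database and do not reflect its actual schema.

```python
from collections import defaultdict

# Hand-written toy records: (language code, word, ILI identifier).
# Illustrative only; not taken from the MCR distribution.
VARIANTS = [
    ("eng", "dog", "ili-30-02084071-n"),
    ("spa", "perro", "ili-30-02084071-n"),
    ("cat", "gos", "ili-30-02084071-n"),
    ("eus", "txakur", "ili-30-02084071-n"),
    ("glg", "can", "ili-30-02084071-n"),
    ("eng", "cat", "ili-30-02121620-n"),
    ("spa", "gato", "ili-30-02121620-n"),
]


def build_indexes(variants):
    """Build word -> ILI and ILI -> words-per-language lookup tables."""
    word_to_ili = defaultdict(set)
    ili_to_words = defaultdict(lambda: defaultdict(set))
    for lang, word, ili in variants:
        word_to_ili[(lang, word)].add(ili)
        ili_to_words[ili][lang].add(word)
    return word_to_ili, ili_to_words


def translate(word, source_lang, target_lang, word_to_ili, ili_to_words):
    """Return target-language words sharing an ILI record with the source word."""
    results = set()
    for ili in word_to_ili.get((source_lang, word), ()):
        results |= ili_to_words[ili].get(target_lang, set())
    return sorted(results)


if __name__ == "__main__":
    w2i, i2w = build_indexes(VARIANTS)
    print(translate("perro", "spa", "eus", w2i, i2w))
```

Because every wordnet points at the same index rather than at each other, adding a language only requires linking its synsets to the ILI, not to each of the other languages.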

The MCR also integrates WordNet Domains, new versions of the Base Concepts and the Top Ontology, and the AdimenSUMO ontology.

In this way, the MCR constitutes a natural, large-scale multilingual semantic resource for the many semantic processes that need large amounts of multilingual knowledge to be effective.

News

This script transforms the Multilingual Central Repository (MCR) 3.0 database so that it can be loaded using the NLTK WordNet reader.

The MCR is integrated into the Open Multilingual WordNet initiative and BabelNet, and is used by Google.

MCR using WordNet 3.0 as ILI

The current version of the MCR (using WordNet 3.0 as ILI) can be consulted using the Web EuroWordNet Interface (consult mode).

The current version of the MCR (using WordNet 3.0 as ILI) can also be edited using the Web EuroWordNet Interface (edit mode).

Download the MCR 3.0 and install it as an SQL database (for both MySQL and PostgreSQL): [zip (37M)] [tar.gz (37M)] [bz2 (32M)] [lzma (21M)]

The MCR 3.0 is distributed under 3 different licenses:

  1. The English WordNet synset and relation data, contained in the engWN/ folder, are distributed under the original WordNet license. You can find it at http://wordnet.princeton.edu/wordnet/license
  2. The Basque WordNet synset and relation data, contained in the eusWN/ folder, are distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) license. You can find it at http://creativecommons.org/licenses/by-nc-sa/3.0
  3. All other data in this package are distributed under the Creative Commons Attribution 3.0 Unported (CC BY 3.0) license. You can find it at http://creativecommons.org/licenses/by/3.0
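For anyone redistributing only part of the package, the three rules above can be sketched as a small lookup. This is only an illustration: it assumes that engWN/ and eusWN/ sit at the top level of the unpacked archive, and the function name license_for and the file names in the usage lines are hypothetical.

```python
from pathlib import PurePosixPath

# Folder-specific licenses as listed above; everything else falls back to CC BY 3.0.
LICENSES_BY_FOLDER = {
    "engWN": "WordNet license",
    "eusWN": "CC BY-NC-SA 3.0",
}
DEFAULT_LICENSE = "CC BY 3.0"


def license_for(relative_path):
    """Return the license name for a path relative to the package root.

    Assumes the licensed folders are top-level directories of the package.
    """
    parts = PurePosixPath(relative_path).parts
    top_folder = parts[0] if parts else ""
    return LICENSES_BY_FOLDER.get(top_folder, DEFAULT_LICENSE)


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the lookup.
    for path in ("engWN/example.tsv", "eusWN/example.tsv", "other/example.tsv"):
        print(path, "->", license_for(path))
```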

For further information about the MCR 3.0 please check this README, this wiki or the MCR mailing list.

If you use the MCR 3.0, please refer to the following publication:

Gonzalez-Agirre A., Laparra E. and Rigau G. Multilingual Central Repository version 3.0: upgrading a very large lexical knowledge base. In Proceedings of the Sixth International Global WordNet Conference (GWC’12). Matsue, Japan. January, 2012.

MCR using WordNet 1.6 as ILI

The previous version of the MCR also contains six different versions of the English WordNet (from 1.6 to 3.0), together with more than one million semantic relations between synsets coming from WordNet, eXtended WordNet, and Selectional Preferences acquired from SemCor. This version of the MCR also contains a version of the Italian WordNet.

The previous version of the MCR (using WordNet 1.6 as ILI) can be consulted using the Web EuroWordNet Interface (consult mode).

The previous version of the MCR (using WordNet 1.6 as ILI) can also be edited using the Web EuroWordNet Interface (edit mode).

Publications

Atserias J., Villarejo L., Rigau G., Agirre E., Carroll J., Magnini B. and Vossen P. The MEANING Multilingual Central Repository. In Proceedings of the Second International Global WordNet Conference (GWC’04). ISBN 80-210-3302-9. Brno, Czech Republic. January, 2004.

Gonzalez-Agirre A., Laparra E. and Rigau G. Multilingual Central Repository version 3.0: upgrading a very large lexical knowledge base. In Proceedings of the Sixth International Global WordNet Conference (GWC’12). Matsue, Japan. January, 2012.

Gonzalez-Agirre A., Laparra E. and Rigau G. Multilingual Central Repository version 3.0. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12). Istanbul, Turkey. 2012.

González-Agirre A. and Rigau G. Construcción de una base de conocimiento léxico multilíngüe de amplia cobertura: Multilingual Central Repository. Linguamática. Revista para o Processamento Automático das Línguas Ibéricas Vol. 5(1). 13-28 - ISSN: 1647-0818. 2013.