WSD of WordNet Glosses
----------------------

Word Sense Disambiguation of WordNet Glosses is ongoing work that aims to
automatically acquire new knowledge (semantic relations) from WordNet by
performing WSD on the WordNet glosses. This package contains evaluation
datasets derived from the WordNet Semantically Annotated gloss corpus
(http://wordnet.princeton.edu/glosstag.shtml).

For more details on this package, including references to the original
resources, please consult the following paper:

Chihuailaf R., Castillo M., Blanco G., Ledesma A. and Rigau G.
Optimización del Algoritmo de WSD SSI-Dijkstra.
Encuentro Chileno de Computación 2013 (ECC2013). Temuco, Chile. 2013.

which can be downloaded at:

http://adimen.si.ehu.es/~rigau/publications/jcc2013-ccblr.pdf

Contents of the distribution
----------------------------

The current distribution of this package consists of the following files:

README.txt
    README file

wnet30+g.v3.random.contexts
    933 WordNet glosses to evaluate automatic WSD algorithms on the
    completing glosses scenario

wnet30+g.v3.random.1.contexts
    933 WordNet glosses to evaluate automatic WSD algorithms on the
    new glosses scenario

wnet30+g.v3.random.contexts.keys
    Key file with the 933 answer keys for the Senseval scorer2 scoring
    program

lkbs/wnet30_rels.txt
    WordNet relations (from WordNet 3.0)

lkbs/wnet30g_rels.txt
    WordNet gloss relations (from the Semantically Annotated gloss corpus)

lkbs/wnet30g_rels_no_random_words.txt
    wnet30g_rels.txt without the 933 relations evaluated in
    wnet30+g.v3.random.contexts (for the completing glosses scenario)

lkbs/wnet30g_rels_no_random_gloss.txt
    wnet30g_rels.txt without any of the gloss relations of the synsets in
    wnet30+g.v3.random.contexts (for the new glosses scenario)

wnet30+g.v3.random.contexts
===========================

This file contains the 933 WordNet glosses randomly selected from the
WordNet Semantically Annotated gloss corpus
(http://wordnet.princeton.edu/glosstag.shtml).
This file is used to evaluate automatic WSD algorithms on the completing
glosses scenario. The format of the file corresponds to the WSD contexts
of UKB (http://ixa2.si.ehu.es/ukb/). For instance, the following context
is the first gloss of the wnet30+g.v3.random.contexts file.

ctx_00031921-n(belong_to#v#wf3#02719930-v)
00031921-n#n#wf0#2 00002137-n#n#wf2#2 belong_to#v#wf3#1 00356926-a#a#wf6#2 00001740-n#n#wf9#2 part#n#wf11#1 together#r#wf12#1

This context corresponds to the nominal WordNet synset 00031921-n, which
has the following sense and gloss:

"relation : an abstraction belonging to or characteristic of two entities or parts together"

Note that in the original Semantically Annotated gloss corpus, "belong_to"
appears disambiguated by the synset 02719930-v. Using this file, the task
consists of predicting the correct synset for belong_to#v, which according
to the manual annotation is 02719930-v. In this case, part of the gloss is
already disambiguated (e.g. abstraction#n with synset 00002137-n).

wnet30+g.v3.random.1.contexts
=============================

This file contains the same 933 WordNet glosses as the file
wnet30+g.v3.random.contexts. However, this file is used to evaluate
automatic WSD algorithms on the new glosses scenario. The format of the
file corresponds to the WSD contexts of UKB (http://ixa2.si.ehu.es/ukb/).
For instance, the following context is the first gloss of the
wnet30+g.v3.random.1.contexts file.

ctx_00031921-n(belong_to#v#wf3#02719930-v)
00031921-n#n#wf0#2 abstraction#n#wf2#1 belong_to#v#wf3#1 characteristic#a#wf6#1 entity#n#wf9#1 part#n#wf11#1 together#r#wf12#1

The first line is the id of the context and the second line is the context
itself. This context corresponds to the same nominal WordNet synset
00031921-n. Note that none of the words from the gloss is now
disambiguated. In this scenario, we consider the gloss as new.
That is, no disambiguation is provided except for the synset being
defined: 00031921-n.

wnet30+g.v3.random.contexts.keys
================================

This is the key file with the answer keys for the Senseval scorer2 scoring
program. For instance, the following line is the gold annotated answer for
the context id used in the previous examples.

ctx_00031921-n(belong_to#v#wf3#02719930-v) wf3 02719930-v

License
-------

This package is distributed under the Attribution 3.0 Unported (CC BY 3.0)
license. You can find it at http://creativecommons.org/licenses/by/3.0

Additional information
----------------------

Ongoing development work is done by a small group of researchers. Since
our resources are VERY limited, we request that you carefully check this
documentation and other resources for an answer to your question or
problem before contacting us.

English Princeton WordNet: http://wordnet.princeton.edu

Contact information
-------------------

German Rigau
IXA Group
University of the Basque Country
E-20018 San Sebastián

Version 1.0. Last updated: 2014/09/26
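
Example: reading the files
--------------------------

The context and key formats described above can be read with a few lines
of Python. The following is only a minimal sketch and is not part of the
distribution: the names ContextWord, parse_context and parse_key are
hypothetical helpers introduced for this illustration. It assumes that
each context entry takes exactly two lines (id, then words), that the
words are separated by whitespace, and that each word has four fields
separated by "#". The last field is kept as an opaque string; see the UKB
documentation for its exact meaning.

```python
from typing import List, NamedTuple, Tuple


class ContextWord(NamedTuple):
    """One token of a context line (hypothetical helper type)."""
    lemma: str    # lemma, or a synset id when the word is already disambiguated
    pos: str      # part of speech: n, v, a or r
    word_id: str  # token id inside the gloss, e.g. "wf3"
    flag: str     # trailing UKB field, kept as an opaque string here


def parse_context(id_line: str, words_line: str) -> Tuple[str, List[ContextWord]]:
    """Parse one two-line context entry into (context id, list of words)."""
    # Split from the right so that exactly the last three "#" act as separators.
    words = [ContextWord(*token.rsplit("#", 3)) for token in words_line.split()]
    return id_line.strip(), words


def parse_key(line: str) -> Tuple[str, str, str]:
    """Split one key line into (context id, word id, gold synset)."""
    context_id, word_id, synset = line.split()
    return context_id, word_id, synset


# Usage with the first entry of wnet30+g.v3.random.1.contexts and its key line.
ctx_id, words = parse_context(
    "ctx_00031921-n(belong_to#v#wf3#02719930-v)",
    "00031921-n#n#wf0#2 abstraction#n#wf2#1 belong_to#v#wf3#1 "
    "characteristic#a#wf6#1 entity#n#wf9#1 part#n#wf11#1 together#r#wf12#1",
)
key_ctx, key_word_id, gold_synset = parse_key(
    "ctx_00031921-n(belong_to#v#wf3#02719930-v) wf3 02719930-v"
)

# The key line points at the word to be disambiguated inside the context.
target = next(w for w in words if w.word_id == key_word_id)
print(ctx_id, target.lemma, target.pos, gold_synset)
```

Splitting each token from the right keeps the parse independent of the
characters that appear in the lemma itself, and matching the word id from
the key file against the context identifies the evaluated word.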