M0041: Bulgarian WordNet

The Bulgarian WordNet was developed by the Department for Computational Linguistics at the Institute for Bulgarian Language, Bulgarian Academy of Sciences, initially within the framework of the BalkaNet project “Multilingual Semantic Network for the Balkan Languages” (IST-2000-29388) and later within the nationally funded BulNet project. For more information about the BalkaNet project, see the project web site and the Department for Computational Linguistics web site.

The Bulgarian WordNet models nouns, verbs, adjectives, and (occasionally) adverbs, and contains 23,715 word senses (synsets). Every synset encodes the equivalence relation between one or more literals, each with a unique sense (specified in the SENSE tag value). All literals in a synset belong to the same part of speech (specified in the POS tag value) and express the same lexical meaning. Each synset is linked to the corresponding synset in the English WordNet 2.0 via its identification number (ID). Every synset has at least one language-internal relation to another synset in the database.

The Bulgarian WordNet is a language-internal structure in which each synset minimally contains:

  • a set of variants or synonyms making up the synset;
  • a part of speech;
  • language-internal relations to other synsets;
  • a unique ID linking the synset to the English WordNet 2.0.
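
The minimal synset contents listed above can be pictured as a small record type. The following Python sketch is illustrative only: the class names, field names, relation label, and the sample ID are hypothetical and do not reflect the actual distribution format. It encodes the two constraints stated in the description (at least one literal, at least one language-internal relation).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Literal:
    """One variant (synonym) in a synset. Hypothetical structure."""
    word: str
    sense: int  # corresponds to the SENSE tag value


@dataclass
class Synset:
    """Minimal synset record. Field names are assumptions, not the real schema."""
    id: str                 # ID shared with the English WordNet 2.0 synset
    pos: str                # corresponds to the POS tag value
    literals: List[Literal]
    # (relation type, target synset ID) pairs; relation labels are assumed
    relations: List[Tuple[str, str]] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return violations of the constraints stated in the description."""
        found = []
        if not self.literals:
            found.append("synset has no literals (at least one is required)")
        if not self.relations:
            found.append("synset has no language-internal relation")
        return found


# Placeholder values for illustration; the IDs are not real identifiers.
example = Synset(
    id="PLACEHOLDER-ID-1",
    pos="n",
    literals=[Literal(word="куче", sense=1)],
    relations=[("hypernym", "PLACEHOLDER-ID-2")],
)
print(example.problems())
```

A check of this kind could be run over every synset after loading the data, whatever the actual storage format turns out to be.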

Number of synsets            23,715
Number of literals           51,011
Domain-specific synsets       1,863
Lexico-semantic relations    41,620
Extralinguistic relations       197
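
A few derived ratios give a rough sense of the density of the resource. The short Python sketch below computes them from the counts listed above. The variable and key names are illustrative, and the ratios are simple averages that say nothing about how literals or relations are distributed across individual synsets.

```python
# Counts taken from the statistics listed above.
counts = {
    "synsets": 23715,
    "literals": 51011,
    "domain_specific_synsets": 1863,
    "lexico_semantic_relations": 41620,
    "extralinguistic_relations": 197,
}

# Simple averages over all synsets (illustrative names).
ratios = {
    "literals_per_synset": counts["literals"] / counts["synsets"],
    "lexico_semantic_relations_per_synset":
        counts["lexico_semantic_relations"] / counts["synsets"],
    "domain_specific_share":
        counts["domain_specific_synsets"] / counts["synsets"],
}

for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

On these counts, a synset holds a little over two literals on average, and slightly under 8% of synsets are domain-specific.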

The Bulgarian WordNet is distributed without: