GenLib Rus


1. Overview

GenLib (short for Generation Library) is a comprehensive morphological database and software library for the Russian language. It is designed to support natural language processing (NLP) tasks that require accurate inflection, lemmatization, and morphological generation. Unlike simple dictionary lists, GenLib provides full inflectional paradigms for well over a hundred thousand Russian lexemes.

Maintained and extended by the Russian computational linguistics community (including contributions to projects like AOT and the Dialogue evaluations), GenLib is one of the key open resources for Russian morphology, alongside OpenCorpora and Zaliznyak’s grammatical dictionary.

2. Key Features

| Feature | Description |
|---------|-------------|
| Large coverage | More than 160,000 lexical entries (depending on version), covering nouns, adjectives, verbs, pronouns, numerals, and participles. |
| Paradigm-based | Each lexeme is assigned to a morphological class (e.g., declension type, conjugation group). |
| Bidirectional | Supports both generation (lemma + grammemes → wordform) and analysis (wordform → lemma + grammemes). |
| Grammatical tagging | Uses a tagset with part of speech, case, number, gender, animacy, tense, person, mood, aspect, etc. |
| Accent marking | Some versions include stress marks for correct pronunciation and poetic scansion. |

3. Data Format (Example)

GenLib is typically distributed as a set of plain text files or a SQLite database. A simplified entry looks like:

    стол S masc inan 2a                 # lemma, POS, gender, animacy, paradigm ID
    стола столе столу столом столе ...  # full paradigm forms
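To make the entry layout concrete, here is a minimal Python sketch that parses a header line of the simplified shape shown in this section and builds a small reverse index for the analysis direction (wordform → lemma). It assumes exactly five whitespace-separated fields and `#` comments, as in the simplified example; the real GenLib file layout may differ, and the names `Entry`, `parse_entry`, and `build_analysis_index` are hypothetical, not part of any GenLib API.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One lexeme, mirroring the simplified sample entry (assumed layout)."""
    lemma: str
    pos: str
    gender: str
    animacy: str
    paradigm_id: str
    forms: list = field(default_factory=list)


def strip_comment(line):
    """Drop a trailing '# ...' comment and surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def parse_entry(header, forms_line=""):
    """Parse a header line plus an optional line of paradigm forms."""
    fields = strip_comment(header).split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {header!r}")
    # '...' in the sample marks elided forms, so it is not a wordform.
    forms = [f for f in strip_comment(forms_line).split() if f != "..."]
    return Entry(*fields, forms=forms)


def build_analysis_index(entries):
    """Map each wordform (and the lemma itself) to its candidate lemmas."""
    index = {}
    for e in entries:
        for form in {e.lemma, *e.forms}:
            index.setdefault(form, set()).add(e.lemma)
    return index


entry = parse_entry(
    "стол S masc inan 2a # lemma, POS, gender, animacy, paradigm ID",
    "стола столе столу столом столе ... # full paradigm forms",
)
index = build_analysis_index([entry])
print(entry.paradigm_id, sorted(index))
```

The index maps a wordform to a set of lemmas rather than a single lemma because Russian wordforms are often ambiguous between lexemes; a fuller analyzer would also return the grammemes for each match.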

Lemma: мышь

Paradigm class: 8e (from Zaliznyak notation, but GenLib uses its own IDs)

Output:

    мышь, мыши, мыши, мышь, мышью, о мыши (singular)
    мыши, мышей, мышам, мышей, мышами, о мышах (plural)

GenLib would produce these forms without storing each wordform explicitly.

GenLib is a foundational resource for Russian morphological generation. While newer tools (like pymorphy2 with OpenCorpora) dominate lemmatization and analysis tasks, GenLib remains valuable when the goal is synthesizing correct inflected forms from lemmas, especially for rule-based text generation, serious games, and educational software. Its paradigm-driven design is compact and efficient, so Russian NLP developers do not have to reinvent inflection tables from scratch.

Last updated: 2026. GenLib continues to be used in legacy and academic projects, though active development has shifted to integrated frameworks like Natasha and DeepPavlov and to fine-tuned BERT-based morphological taggers.
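The мышь example can be sketched as paradigm-driven generation: store one stem and one paradigm ID per lexeme, and attach endings from a shared table. The following Python sketch is an illustration under assumptions, not GenLib's actual implementation. The ID `"8e"` is borrowed from the Zaliznyak label in the example (GenLib uses its own IDs), the ending table is written out only for an animate sibilant stem like мышь, and `PARADIGMS`, `LEXICON`, `generate`, `full_paradigm`, and `inflect` are hypothetical names. Stress and the preposition "о" are not modeled.

```python
CASES = ("nom", "gen", "dat", "acc", "ins", "prep")

# Ending table keyed by a hypothetical paradigm ID. The endings below are
# spelled out for an animate feminine noun with a sibilant stem (мышь).
PARADIGMS = {
    "8e": {
        "sg": {"nom": "ь", "gen": "и", "dat": "и",
               "acc": "ь", "ins": "ью", "prep": "и"},
        "pl": {"nom": "и", "gen": "ей", "dat": "ам",
               "acc": "ей", "ins": "ами", "prep": "ах"},
    },
}

# Lexicon: lemma -> (stem, paradigm ID). Only the stem and the ID are stored,
# not the twelve wordforms themselves.
LEXICON = {"мышь": ("мыш", "8e")}


def generate(stem, paradigm_id, number, case):
    """Attach the ending selected by (number, case) to the stem."""
    return stem + PARADIGMS[paradigm_id][number][case]


def full_paradigm(stem, paradigm_id):
    """Return all case forms for singular and plural, in CASES order."""
    return {
        number: [generate(stem, paradigm_id, number, case) for case in CASES]
        for number in ("sg", "pl")
    }


def inflect(lemma, number, case):
    """Generation direction: lemma + grammemes -> wordform."""
    stem, paradigm_id = LEXICON[lemma]
    return generate(stem, paradigm_id, number, case)


paradigm = full_paradigm(*LEXICON["мышь"])
print(", ".join(paradigm["sg"]))
print(", ".join(paradigm["pl"]))
```

Adding another lexeme of the same class then costs one lexicon row rather than a full list of forms, which is the storage saving the paradigm-based design aims for.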