https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals
Phase 1: Lexicons
This phase requires parsing the existing pages for some basic information. There appears to be clear consensus on the desired data model, but extracting the data will take some work.
The existing Wiktionary structure places language at the next level of the hierarchy below the representation (the page title). Within a language, lexemes are separated first by etymology and then by lexical category. For the most part, forms and grammatical categories are distinguished only by sense or by conjugation tables.
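As a rough illustration of the parsing work involved, the sketch below walks the section headings of one page and emits one candidate lexeme per (language, etymology, lexical category) split. It assumes English Wiktionary heading conventions (`==English==`, `===Etymology 1===`, `====Noun====`). The names `split_page`, `LexemeCandidate` and `LEXICAL_CATEGORIES` are hypothetical, and the category set is deliberately incomplete.

```python
import re
from typing import Iterator, NamedTuple, Optional

# Assumption: headings look like "==English==", "===Etymology 1===", "====Noun====".
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")

# Hypothetical, incomplete set of headings treated as lexical categories.
LEXICAL_CATEGORIES = {"Noun", "Verb", "Adjective", "Adverb", "Pronoun", "Preposition"}


class LexemeCandidate(NamedTuple):
    representation: str        # the page title
    language: str
    etymology: Optional[str]   # None when no Etymology heading precedes the category
    lexical_category: str


def split_page(title: str, wikitext: str) -> Iterator[LexemeCandidate]:
    """Yield one candidate per lexical-category heading found on the page."""
    language: Optional[str] = None
    etymology: Optional[str] = None
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match is None:
            continue  # body text, templates, etc. are ignored in this sketch
        level, name = len(match.group(1)), match.group(2)
        if level == 2:
            # A level-2 heading starts a new language section.
            language, etymology = name, None
        elif name.startswith("Etymology"):
            etymology = name
        elif name in LEXICAL_CATEGORIES and language is not None:
            yield LexemeCandidate(title, language, etymology, name)
```

Forms and grammatical categories are not recovered here; as noted above, they would have to come from sense lines or conjugation tables, which need separate handling.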
Data model
- lexeme (L)
  - language
  - lexical category
  - form (F)
    - grammatical category
    - representation (R)
      - script
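
The data model can be sketched as plain Python dataclasses. This is only an illustration, and it assumes the list is read as a nested hierarchy in which a lexeme owns its forms and a form owns its representations. The class and field names are hypothetical, and the `text` field is an addition not present in the list above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Representation:
    text: str    # assumption: the written string itself (not in the list above)
    script: str  # e.g. an ISO 15924 code such as "Latn"


@dataclass
class Form:
    grammatical_categories: List[str]
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Lexeme:
    language: str
    lexical_category: str
    forms: List[Form] = field(default_factory=list)


# Usage: one English verb lexeme with a single form.
walk = Lexeme(
    language="English",
    lexical_category="verb",
    forms=[Form(["present participle"], [Representation("walking", "Latn")])],
)
```

Keeping `script` on the representation rather than on the lexeme would let one form carry spellings in several scripts, which seems to be the point of separating (F) from (R).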