Export Wiktionnaire in dictionary formats
Open, Medium, Public

Description

Wiktionaries are semi-structured, so it is difficult to use their data without relying on custom parsers.

Anagrimes should provide a tool to export fr.wikt data to file formats readable by electronic dictionary software, such as StarDict.

Event Timeline

Darkdadaah claimed this task.
Darkdadaah raised the priority of this task from to Needs Triage.
Darkdadaah updated the task description.
Darkdadaah added a project: Tool-anagrimes.
Darkdadaah moved this task to Features requests on the Tool-anagrimes board.
Darkdadaah added a subscriber: Darkdadaah.
Darkdadaah set Security to None.

Here is a list of XDXF files created from a fr.wikt dump: https://tools.wmflabs.org/anagrimes/data/xdxf/.

Tested with GoldenDict, some notes:

  • Abbreviation meanings seem to be ignored (probably an XDXF format issue)
  • Language codes are not standard in XDXF (e.g. FRE for French)
  • Only languages with more than 1000 words are listed
  • Inflected forms are excluded by default (to reduce file sizes)
  • Spelling variants are not taken into consideration (e.g. clef = clé, which are different pages)

Nonetheless, it is usable.
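
For anyone picking this up, the filtering described in the notes above could be sketched roughly as follows. This is not the actual Anagrimes code: the tuple layout of the entries, the function names `select_entries` and `build_xdxf`, the choice to count words after dropping inflected forms, and the use of FRE as the target language are all assumptions made for illustration. The element names (`xdxf`, `full_name`, `description`, `ar`, `k`) follow the older "visual" XDXF layout as I understand it and should be checked against the format documentation.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical entry shape: (language_code, headword, definition, is_inflection)


def select_entries(entries, min_words=1000, include_inflections=False):
    """Drop inflected forms (unless requested), then keep only languages
    that have more than `min_words` remaining entries."""
    kept = [e for e in entries if include_inflections or not e[3]]
    # Assumption: the word count is taken after inflected forms are removed.
    counts = Counter(e[0] for e in kept)
    return [e for e in kept if counts[e[0]] > min_words]


def build_xdxf(entries, lang_code, title):
    """Serialize the entries of one language as a minimal XDXF-like document."""
    # Assumption: definitions come from fr.wikt, so the target language is French.
    root = ET.Element("xdxf", lang_from=lang_code, lang_to="FRE", format="visual")
    ET.SubElement(root, "full_name").text = title
    ET.SubElement(root, "description").text = "Generated from a fr.wikt dump (sketch)"
    for lang, word, definition, _ in entries:
        if lang != lang_code:
            continue
        article = ET.SubElement(root, "ar")
        key = ET.SubElement(article, "k")
        key.text = word
        # The definition follows the <k> element inside the same <ar>.
        key.tail = definition
    return ET.tostring(root, encoding="unicode")
```

With a real dump, `select_entries` would be called once on all entries and `build_xdxf` once per language code that remains, writing each result to its own file. Spelling variants such as clef and clé would still end up as separate articles in this sketch.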

@Darkdadaah: I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!).
Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task.
Please claim this task again when you plan to work on it (via Add Action...Assign / Claim in the dropdown menu) - it would be welcome! Thanks for your understanding!