Wiktionary needs usable API
Open, Normal, Public

Description

I can't seem to find a pre-existing bug, but Wiktionary needs a usable API.

Currently Wiktionary relies on MediaWiki's api.php, but that was (largely) built for Wikipedia. A proper Wiktionary API would allow retrieving definitions in a particular language from a language version of Wiktionary. Probably a few other things as well. ;-)


Version: unspecified
Severity: enhancement

Details

Reference
bz36881

Event Timeline

bzimport raised the priority of this task from to Normal. · Nov 22 2014, 12:26 AM
bzimport set Reference to bz36881.
bzimport added a subscriber: Unknown Object (MLST).

This should be a tracking bug. But I don't know of any other issues to put here.

Qgil added a comment. · Mar 25 2013, 1:02 AM

This idea has been suggested by Siebrand as a potential Google Summer of Code project at http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Wiktionary_APIs

Does this make sense? Has there been any discussion in the Wiktionary community about specific API needs? I just want to know whether we would have a roughly defined project for a student. If the student would have to start by going to English Wiktionary and asking, then this is not a feasible project proposal for GSOC 2013.

https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas

If the idea makes sense we would also need at least one mentor.

(In reply to comment #2)

Note that I've previously tried to do this. While part of the reason my attempt semi-failed was that I was a newbie at the time, I would like to state that this is not the easiest problem to solve (especially if you intend to keep Wiktionary as it is currently, without any explicit machine-readable annotations).

Btw, for reference, my attempt: http://en.wikinews.org/w/index.php?title=User:Bawolff/sandbox/Wiktionary_query (don't view on mobile site)

It's not exactly an API, but it does similar things to an API. Part of the reason it sucks so much is that the naive design choices were horrid (younger me was stupid; if you read the code, don't judge too hard). Anyhow, as a result of my experience with that, I wouldn't recommend this as a GSoC project unless the student already had quite a bit of proper experience with parsing.

wmf.amgine3691 wrote:

Side note: the usual first approach to this is to look at existing dictionary API standards. There are a large number of existing, mostly proprietary, systems currently in production that map en.Wiktionary to existing standards. There are almost no efforts doing so with other languages.

If someone would just implement RFC 2229, that would be awesome. https://tools.ietf.org/html/rfc2229

Alternatively, make the api calls as compatible as possible with that RFC.
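
To give a rough idea of what RFC 2229 compatibility would involve on the wire, here is a minimal Python sketch of the DEFINE exchange: building the command and splitting a response into definitions. The function names are made up for illustration, the quoting rule is my reading of the RFC, and the database name `wikt` in the usage note is hypothetical. It does no networking and ignores every status code other than 151.

```python
from __future__ import annotations


def build_define_command(word: str, database: str = "*") -> bytes:
    """Format a DEFINE command; "*" asks the server to search all databases."""
    # Assumption: backslash-escaping inside double quotes, per my reading of RFC 2229.
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return f'DEFINE {database} "{escaped}"\r\n'.encode("utf-8")


def parse_define_response(raw: str) -> list[dict]:
    """Split a DEFINE response into definitions.

    Each definition starts with a "151" status line and ends with a line
    holding only "."; a body line starting with ".." is un-stuffed to ".".
    """
    definitions = []
    current = None  # definition being collected, or None between definitions
    for line in raw.split("\r\n"):
        if current is None:
            if line.startswith("151 "):
                current = {"header": line[4:], "lines": []}
            continue  # skip 150/250 and other status lines
        if line == ".":
            definitions.append(
                {"header": current["header"], "text": "\n".join(current["lines"])}
            )
            current = None
        else:
            current["lines"].append(line[1:] if line.startswith("..") else line)
    return definitions
```

For example, `build_define_command("cat", "wikt")` yields the bytes for `DEFINE wikt "cat"` followed by CRLF, and feeding the server's reply to `parse_define_response` gives one dict per 151 block. The hard part discussed in this task, getting clean definition text out of wikitext, is not touched by this sketch.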

wmf.amgine3691 wrote:

A couple of hours' searching for projects that use Wiktionary content, particularly ones targeting DICT or WordNet, or for code discussions on parsing Wiktionary content (a very popular whinge topic on Stack Overflow):

https://github.com/onny/wikidict
http://extensions.libreoffice.org/extension-center/hunspell-is-the-icelandic-spelling-dictionary-project
http://www.trustlet.org/wiki/Wik2dict

(svn checkout http://wik2dict.googlecode.com/svn/trunk/ wik2dict-read-only)

https://code.google.com/p/wikokit/
http://inamidst.com/phenny/modules/wiktionary.py (ircbot module extracting data/metadata from wiktionary)
http://stackoverflow.com/search?q=wiktionary
http://goldendict.org/forum/viewtopic.php?f=5&t=1205 (en.WT for GoldenDict)
http://godlewski.free.fr/wiktionary-dict/
https://www.mediawiki.org/wiki/User:Gautham_shankar/Gsoc

Scholarly works:
http://scholar.google.ca/scholar?hl=en&q=Wiktionary&btnG=&as_sdt=1%2C5&as_sdtp=
http://www.aaaipress.org/Papers/AAAI/2008/AAAI08-137.pdf (Using Wiktionary for Computing Semantic Relatedness)
http://www.ukp.tu-darmstadt.de/data/lexical-resources/wordnet-wiktionary-alignment/

*** Bug 21450 has been marked as a duplicate of this bug. ***

To make this happen, Wiktionary needs to store its data in a structured, machine-readable format. We have proposals for how to do that at https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development. Once that is done, the API will be done as well.

mxn added a subscriber: mxn. · Nov 24 2014, 9:00 PM
Ricordisamoa added a subscriber: Ricordisamoa.
Reedy set Security to None. · Nov 29 2015, 1:59 PM
Reedy removed a subscriber: wikibugs-l-list.
Restricted Application added a subscriber: Aklapper. · Nov 29 2015, 1:59 PM
Reedy moved this task from Unsorted to Non-core-API stuff on the MediaWiki-API board.
dg711 added a subscriber: dg711. · Dec 30 2015, 1:19 AM
Meno25 removed a subscriber: Meno25. · Feb 8 2016, 7:40 PM

There is now an experimental API endpoint for Wiktionary definitions at https://en.wiktionary.org/api/rest_v1/?doc#!/Page_content/get_page_definition_term

This API is used by the Android app to provide definitions for words using Wiktionary data, but it is currently only available for the English Wiktionary. T138709 discusses ways to expand coverage to other languages by adding standard markup to consistently identify specific components of the definitions. Please chime in there.
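
For anyone wanting to try the endpoint, a minimal Python sketch of a client follows. Two things here are assumptions rather than facts from this task: the URL path `/page/definition/{term}`, which I inferred from the operation name `get_page_definition_term` in the link above, and the response shape (a JSON object keyed by language code, each value a list of entries with `partOfSpeech` and `definitions`). Check both against the live documentation, since the endpoint is experimental. The helper names and the `SAMPLE` payload are invented for illustration.

```python
from __future__ import annotations

import json
from urllib.parse import quote

# Base URL taken from the documentation link in this task.
REST_BASE = "https://en.wiktionary.org/api/rest_v1"


def definition_url(term: str) -> str:
    """Build the request URL for a term (path layout is an assumption)."""
    return f"{REST_BASE}/page/definition/{quote(term, safe='')}"


def definitions_for_language(payload: dict, lang_code: str) -> list[tuple[str, str]]:
    """Return (part of speech, definition) pairs for one language code.

    Assumes the payload maps language codes to lists of entries, each with
    a "partOfSpeech" string and a "definitions" list of objects.
    """
    results = []
    for entry in payload.get(lang_code, []):
        pos = entry.get("partOfSpeech", "")
        for sense in entry.get("definitions", []):
            results.append((pos, sense.get("definition", "")))
    return results


# Hand-written sample in the assumed shape; not real API output.
SAMPLE = json.loads(
    '{"en": [{"partOfSpeech": "Noun", "language": "English",'
    ' "definitions": [{"definition": "A domesticated feline."}]}]}'
)

if __name__ == "__main__":
    print(definition_url("ice cream"))
    print(definitions_for_language(SAMPLE, "en"))
```

To use it against the real service, fetch `definition_url(term)` with any HTTP client, decode the JSON body, and pass it to `definitions_for_language`. Filtering by language code is the piece that lines up with the original request in this task: definitions in a particular language from a given Wiktionary.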

Restricted Application added a project: Wikidata. · Oct 4 2016, 9:49 AM
Lydia_Pintscher moved this task from incoming to hold on the Wikidata board. · Mar 22 2017, 11:34 AM
dardo82 rescinded a token.
dardo82 awarded a token.
dardo82 added a subscriber: dardo82.
He7d3r added a subscriber: He7d3r. · Feb 6 2018, 3:27 PM
He7d3r awarded a token. · Feb 6 2018, 3:31 PM