Add route to provide Wiktionary definition of requested word or phrase.
Closed, Resolved · Public · 3 Story Points

Description

This task is to add a route to the Content Service that returns a Wiktionary definition of the provided term. This will likely involve fetching the mobileview response from Wiktionary and applying some DOM cleanup. This can be constrained to English Wiktionary (enwiktionary) for now.
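
As a rough illustration of the approach described above, the sketch below builds a mobileview request URL for English Wiktionary and reduces a returned HTML fragment to plain text. The helper names, the chosen query parameters, and the cleanup rule (dropping all tags, plus the contents of `style` and `script` elements) are assumptions made for illustration; they are not the Content Service's actual implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def mobileview_url(term, domain="en.wiktionary.org"):
    """Build a mobileview API request URL for a term (hypothetical helper)."""
    params = {
        "action": "mobileview",
        "format": "json",
        "page": term,
        "sections": "all",  # assumption: request every section
        "prop": "text",     # assumption: only the section HTML is needed
    }
    return "https://" + domain + "/w/api.php?" + urlencode(params)


class _TextOnly(HTMLParser):
    """Collect text nodes, skipping the contents of <style> and <script>."""

    SKIPPED = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_fragment(html):
    """Strip markup from an HTML fragment and collapse whitespace."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

A real cleanup step would probably keep some structure (for example list items for separate senses) rather than flattening everything to text; the flattening here only keeps the example short.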

Dbrant created this task. Nov 20 2015, 8:23 PM
Dbrant updated the task description.
Dbrant raised the priority of this task to Needs Triage.
Dbrant moved this task to Next Sprint on the Wikipedia-Android-App-Backlog board.
Dbrant added subscribers: Dbrant, Niedzielski, Mholloway, bearND.
Restricted Application added subscribers: StudiesWorld, Aklapper. Nov 20 2015, 8:23 PM
Mholloway set Security to None.
Mholloway moved this task from Backlog to Doing on the Mobile-Content-Service board.
Mholloway triaged this task as Normal priority.
Mholloway removed a subscriber: Mholloway.

Change 255263 had a related patch set uploaded (by Mholloway):
WIP: Mobile content service route for Wiktionary definitions

https://gerrit.wikimedia.org/r/255263

Interesting! Wiktionary or just English Wiktionary?

@Nemo_bis Hi! The WIP patch is English-only for now, but this feature is planned for all languages.

MBinder_WMF edited a custom field.

@GWicke and @mobrovac, what are your thoughts on an endpoint for this?
In the current patch it is just {domain}/v1/definition/{term}.

Change 255263 merged by jenkins-bot:
Create mobile content service route for Wiktionary definitions

https://gerrit.wikimedia.org/r/255263

@bearND: How close is this to the "summary" response? Would using /{domain}/v1/page/summary/{term} for wiktionary make sense, or do you foresee a separate use case for a summary on wiktionaries? If so, /{domain}/v1/page/definition/{term} could work with the existing URL layout as well.

Why do you propose URLs such as /{domain}/v1/page/summary/{term}? Where is the language code? I thought the main point of parsing a Wiktionary entry was to extract text from the correct section/sub-lemma, but I now notice the task description is unclear on requirements. Is this supposed to work only for the "monolingual" entries?

GWicke added a comment. Edited Dec 30 2015, 7:44 PM

@Nemo_bis: The language code is part of the domain. We already have the summary entry point on Wiktionaries (example), but its extract property currently looks less useful there than in a typical Wikipedia summary.

I should also explain that the pattern /{domain}/v1/... is basically only used internally now. The public API uses https://{domain}/api/rest_v1/....

The language code is part of the domain

Uh? I'm talking of the language of the lemma. For instance https://en.wiktionary.org/wiki/lemma is a page which has content language "en", default interface language "en" and lemmas for en, cz, fi, it, la, sv.

Hi @Nemo_bis,

The dialogs (and the service to support them) are designed according to the mocks at T114949. The idea isn't to provide the content of a Wiktionary page in full, but rather, in the interest of user-friendliness, to provide a focused subset of the content that's likely to be most relevant to the user, namely the definitions for the selected word in the current page language, along with their supporting examples.
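
To make the selection concrete, here is a minimal sketch of picking out the subsections that belong to one language on a multi-language entry. The flat section list, its `level` and `heading` fields, and the assumption that language headings sit at level 2 are hypothetical simplifications for illustration, not the service's real data model.

```python
LANGUAGE_LEVEL = 2  # assumption: language headings are level-2 headings


def sections_for_language(sections, language):
    """Return the subsections nested under the given language heading."""
    selected = []
    inside = False
    for section in sections:
        if section["level"] <= LANGUAGE_LEVEL:
            # A new language block starts; track whether it is the one we want.
            inside = section["heading"] == language
            continue
        if inside:
            selected.append(section)
    return selected


# Illustrative, made-up section list for an entry with two languages.
example_sections = [
    {"level": 2, "heading": "English"},
    {"level": 3, "heading": "Etymology"},
    {"level": 3, "heading": "Noun"},
    {"level": 2, "heading": "Italian"},
    {"level": 3, "heading": "Noun"},
]

english = sections_for_language(example_sections, "English")
```

With the example list, `english` holds the "Etymology" and "Noun" subsections of the English block only; the Italian "Noun" section is left out.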

@GWicke, following on my note to @Nemo_bis above, this is why I omitted the /page/ section from the path of the Wiktionary endpoint in the recently merged mobile content service patch, leaving it at {domain}/v1/definition/{term}: this endpoint won't be serving a page, per se, but rather a fairly narrowly focused subset of a Wiktionary page's content. But if there's a reason to keep /page/ in the path, technical or otherwise, I don't object to putting it back.

GWicke added a comment. Edited Dec 31 2015, 4:58 PM

@Mholloway, the main reason for including the page prefix would be consistency with other page-related information, such as page/html, page/summary, page/title and so on. This limits top-level groups to (currently) page, media, metrics and transform.

Change 261775 had a related patch set uploaded (by Mholloway):
Put page/ back in endpoint path

https://gerrit.wikimedia.org/r/261775

@GWicke, makes sense. I just pushed a patch to add /page back in, so the endpoint will be deployed as {domain}/v1/page/definition/{term}.
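
For reference, a small sketch of how a client might build the two URL forms discussed in this thread. The function names are hypothetical, the percent-encoding of the term is an assumption, and the public form is inferred by combining the `https://{domain}/api/rest_v1/...` prefix mentioned earlier with the `page/definition/{term}` path; it is not confirmed here as a deployed URL.

```python
from urllib.parse import quote


def internal_definition_path(domain, term):
    """Internal-style path: /{domain}/v1/page/definition/{term}."""
    # safe="" also encodes "/" so a term cannot add extra path segments.
    return "/{}/v1/page/definition/{}".format(domain, quote(term, safe=""))


def public_definition_url(domain, term):
    """Public-style URL (inferred): https://{domain}/api/rest_v1/page/definition/{term}."""
    return "https://{}/api/rest_v1/page/definition/{}".format(
        domain, quote(term, safe="")
    )
```

For example, `public_definition_url("en.wiktionary.org", "free lunch")` yields a URL ending in `/page/definition/free%20lunch`.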

Change 261775 merged by Mobrovac:
Put page/ back in endpoint path

https://gerrit.wikimedia.org/r/261775

Dbrant closed this task as Resolved. Jan 18 2016, 3:46 PM

Please announce this at the English Wiktionary's Grease pit and on Wiktionary-l. I'm not sure what this actually is, so I can't do it myself. Thanks.

Ping

@Nemo_bis
The Wiktionary endpoint is still very much experimental, and is subject to change. One of the ongoing goals for the Android app is to integrate more rich content into the browsing experience. One such feature is to allow the user to highlight words in an article and see a quick popup definition of the word from Wiktionary (T115484). To facilitate this action, we set up a RESTBase endpoint for fetching the desired term from Wiktionary (T119235).

This feature is currently available only in the Wikipedia Beta app, and is restricted to English Wiktionary. Further work on this endpoint will depend on the level of user engagement with the feature once it's rolled out to the main Wikipedia app. So, once again, even though we're building the endpoint in the hope that it will be used by consumers other than the Android app (and expanded to all languages), at the moment it's by no means ready for general consumption.