[spike] Can the page/related endpoint be replaced by a single MW API call?
Closed, Resolved (Public)

Description

The page/related endpoint is being used by our MobileHtml client-side javascript to construct the "Read More" items seen at the bottom of an article.

This endpoint basically does nothing more than a morelike: search, followed by page/summary queries for each returned page, finally returning a list of summary objects.

Instead of relying on this endpoint, can we get away with a single MW API call to get the same data, and construct the "Read More" items in the same way?

Event Timeline

JTannerWMF added subscribers: MSantos, JTannerWMF.

@MSantos how soon do you need this so that we can prioritize it with our other work?

@Dbrant, can you take the lead on this one?

The short answer is: absolutely.

We can use literally the same API call that is used by Mobile Web to load their "read more" items:
https://en.m.wikipedia.org/w/api.php?action=query&formatversion=2&origin=*&prop=pageimages%7Cdescription&piprop=thumbnail&pithumbsize=160&pilimit=3&generator=search&gsrsearch=morelike%3ASaturn&gsrnamespace=0&gsrlimit=3&gsrqiprofile=classic_noboostlinks

^ It's a single API call that provides three items with all the information necessary to render them.
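
As a rough illustration of the call above, here is a minimal sketch that assembles the same query string in client-side code. The function name `buildReadMoreUrl` and its `host` parameter are hypothetical, not taken from the MobileHtml source. The `format=json` parameter is an addition (it appears in the zh.wikipedia.org example later in this thread), on the assumption that the caller wants JSON rather than the API's default pretty-printed output.

```typescript
// Hypothetical helper: builds the single MW API request quoted above.
// All parameter values mirror the Mobile Web URL, except `format`, which
// is added so that the response can be parsed as JSON.
function buildReadMoreUrl(title: string, host: string = "en.m.wikipedia.org"): string {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    formatversion: "2",
    origin: "*",
    prop: "pageimages|description",
    piprop: "thumbnail",
    pithumbsize: "160",
    pilimit: "3",
    generator: "search",
    gsrsearch: `morelike:${title}`, // morelike: search for related articles
    gsrnamespace: "0",
    gsrlimit: "3",
    gsrqiprofile: "classic_noboostlinks",
  });
  return `https://${host}/w/api.php?${params.toString()}`;
}

console.log(buildReadMoreUrl("Saturn"));
```

The resulting URL can be passed to `fetch` as-is; `URLSearchParams` takes care of percent-encoding the `|` and `:` characters.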

The one tiny difference with our page/related logic is that we also use the "extract" field from the page summary, in the following edge cases: If the article is missing a description, or if the description is less than 10 characters (?), then we plug in the extract in its place.

To which I would offer a couple of possibilities:

  1. We can easily modify our single API call to include an extract provided by MediaWiki itself:

https://en.m.wikipedia.org/w/api.php?action=query&formatversion=2&origin=*&prop=pageimages%7Cdescription%7Cextracts&piprop=thumbnail&pithumbsize=160&pilimit=3&exchars=100&exintro=1&explaintext=1&generator=search&gsrsearch=morelike%3ASaturn&gsrnamespace=0&gsrlimit=3&gsrqiprofile=classic_noboostlinks

However, this might create additional load on the server, and/or create a bit more latency.

-or-

  2. We can stop using the extract for our no-description edge case, and switch purely to the API call that Mobile Web already makes. (i.e. If a "Read more" item is missing a description, then it will just be a Title + thumbnail.)

^ cc @JTannerWMF
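
To make the difference between the two options concrete, here is a minimal sketch of how a page from the response might be mapped to a "Read more" item. The `RelatedPage` and `ReadMoreItem` shapes and the `toReadMoreItem` function are hypothetical, and the 10-character threshold comes from the unconfirmed "(?)" remark above. How option 2 should treat a short but present description is not specified in this thread; the sketch simply keeps it.

```typescript
// Assumed shape of one page in the formatversion=2 response (only the
// fields used here); not copied from any official type definition.
interface RelatedPage {
  title: string;
  description?: string;
  extract?: string;
  thumbnail?: { source: string; width: number; height: number };
}

// Hypothetical view model for one "Read more" item.
interface ReadMoreItem {
  title: string;
  subtitle?: string;
  thumbnailUrl?: string;
}

// Threshold from the "less than 10 characters (?)" remark; unconfirmed.
const MIN_DESCRIPTION_LENGTH = 10;

function toReadMoreItem(page: RelatedPage, useExtractFallback: boolean): ReadMoreItem {
  const description = page.description;
  let subtitle: string | undefined = description;
  const tooShort = !description || description.length < MIN_DESCRIPTION_LENGTH;
  if (useExtractFallback && tooShort) {
    // Option 1: plug in the extract when the description is missing or short.
    subtitle = page.extract || description;
  }
  // Option 2 (useExtractFallback = false): a missing description leaves the
  // item as title + thumbnail only.
  return {
    title: page.title,
    subtitle: subtitle || undefined,
    thumbnailUrl: page.thumbnail ? page.thumbnail.source : undefined,
  };
}
```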

Feel free to go forward with option 2!

Thanks, @Dbrant, the API looks good to me!

I just have one concern, about the language variants on zh.wiki:
https://zh.wikipedia.org/w/api.php?action=query&exchars=100&exintro=1&explaintext=1&formatversion=2&generator=search&gsrlimit=3&gsrnamespace=0&gsrqiprofile=classic_noboostlinks&gsrsearch=morelike%3ASaturn&origin=*&pilimit=3&piprop=thumbnail&pithumbsize=160&prop=pageimages|description|extracts&format=json

The first article title in the response is the default title on zh.wiki, and it does not change to the target variant if I set Accept-Language: zh-hant in the request header.

zh: 卡西尼-惠更斯号
zh-hant: 卡西尼-惠更斯號

Do you think it is possible to set a specific language code and get the correct title from the API call?

Also, I'm not sure whether there's a way to get the descriptions in different variants, similar to varianttitles, or whether we need to make another API call for that.

I'm not sure this is possible in a single request. But the REST endpoint wasn't doing this correctly either, was it?

You are correct. This has been a known issue across the app, and it looks like the endpoint always gets the description from the parent language code only.

Change 982154 had a related patch set uploaded (by Dbrant; author: Dbrant):

[mediawiki/services/mobileapps@master] [WIP] No longer use page/related endpoint, and load lazily.

https://gerrit.wikimedia.org/r/982154

Hi @Dbrant

Per the response in T353199#9398211, we can pass variant=zh-tw (or another language variant code) to get the correct extract.

Similar to how it's done by the RelatedArticles extension, could you include smaxage=86400&maxage=86400 to make this API request cacheable?
In a similar vein, would it be possible to ask for 3 results instead of 5? This way you might share the same internal MW cache as the RelatedArticles extension.
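
For illustration, the caching and variant suggestions could be applied to an existing request URL roughly as follows. This is a sketch only: `tuneReadMoreUrl` is a hypothetical name, and forcing `pilimit` to 3 alongside `gsrlimit` is an assumption made to match the Mobile Web URL quoted earlier in the thread.

```typescript
// Hypothetical helper: adds the cache lifetime and (optionally) the language
// variant to an existing MW API URL, and caps the result count at 3.
function tuneReadMoreUrl(apiUrl: string, variant?: string, maxAgeSeconds: number = 86400): string {
  const url = new URL(apiUrl);
  // Cache lifetimes as suggested above (86400 seconds = 24 hours).
  url.searchParams.set("smaxage", String(maxAgeSeconds));
  url.searchParams.set("maxage", String(maxAgeSeconds));
  // Ask for 3 results so the request may share a cache with RelatedArticles.
  url.searchParams.set("gsrlimit", "3");
  url.searchParams.set("pilimit", "3"); // assumption: keep in step with gsrlimit
  if (variant) {
    // e.g. "zh-tw", per T353199#9398211
    url.searchParams.set("variant", variant);
  }
  return url.toString();
}

console.log(
  tuneReadMoreUrl(
    "https://zh.wikipedia.org/w/api.php?action=query&generator=search&gsrsearch=morelike%3ASaturn&gsrlimit=5",
    "zh-tw"
  )
);
```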

Great points @dcausse, will do!

@JTannerWMF I missed the comment about timelines. What do you think would be a reasonable timeline for you to work on both iOS and Android?

FWIW, we will review the WIP patch @Dbrant created a while ago and go from there. If we need your support on it, I'll let you know.

MSantos raised the priority of this task from Low to Medium. Apr 29 2024, 11:28 AM
MSantos moved this task from Backlog to Code Review on the Content-Transform-Team-WIP board.

Change #982154 merged by jenkins-bot:

[mediawiki/services/mobileapps@master] No longer use page/related endpoint, and offer lazy loading.

https://gerrit.wikimedia.org/r/982154