Improve /page/related/ response time in zhwiki
Open, Low, Public

Description

The /page/related/ endpoint responds noticeably slower on zhwiki than on enwiki.

For example, this request takes ~10 seconds to return a response on zhwiki:
https://zh.wikipedia.org/api/rest_v1/page/related/台灣
when sent with the header Accept-Language: zh-hant.
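
To make the comparison reproducible outside the app, the request can be timed with and without the header. This is a minimal sketch, not the app's actual networking code: the helper names (build_request, time_request, compare), the User-Agent string, and the 30-second timeout are assumptions for illustration; only the URLs and the Accept-Language value come from this task.

```python
import time
import urllib.parse
import urllib.request

# URL pattern taken from the task description.
BASE = "https://{host}/api/rest_v1/page/related/{title}"


def build_request(host, title, accept_language=None):
    """Build a GET request, optionally carrying an Accept-Language header."""
    # Non-ASCII titles such as 台灣 must be percent-encoded for urllib.
    url = BASE.format(host=host, title=urllib.parse.quote(title, safe=""))
    # Hypothetical User-Agent; replace with one identifying your client.
    headers = {"User-Agent": "related-timing-sketch/0.1 (example)"}
    if accept_language:
        headers["Accept-Language"] = accept_language
    return urllib.request.Request(url, headers=headers)


def time_request(request, opener=urllib.request.urlopen):
    """Return the seconds taken to fetch and fully read the response."""
    start = time.perf_counter()
    with opener(request, timeout=30) as response:
        response.read()
    return time.perf_counter() - start


def compare():
    """Time the three cases described above (performs real network calls)."""
    cases = [
        ("zhwiki + zh-hant", build_request("zh.wikipedia.org", "台灣", "zh-hant")),
        ("zhwiki, no header", build_request("zh.wikipedia.org", "台灣")),
        ("enwiki", build_request("en.wikipedia.org", "Taiwan")),
    ]
    for label, request in cases:
        print(f"{label}: {time_request(request):.2f} s")
```

Calling compare() prints one timing per case. Results will vary with network conditions and any caching in front of the endpoint.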

Screenshot at Jan 05 09-26-46.png (328×1 px, 107 KB)

But on enwiki, or on zhwiki without the Accept-Language: zh-hant header, the content is returned immediately.
https://en.wikipedia.org/api/rest_v1/page/related/Taiwan

Screenshot at Jan 05 09-27-06.png (376×1 px, 109 KB)

Screenshot at Jan 05 09-25-23.png (72×1 px, 31 KB)

This issue affects the Wikipedia Android app: the card takes ~10 seconds to load and also blocks the remaining cards from loading. Please see the demo below:
https://www.youtube.com/watch?v=-J0Uzo53bhI

Event Timeline

@cooltey I tried to reproduce this but got similar results for both requests, with response times of around 162 ms. Could you please provide more information on how you are making the requests?

zhwiki

image.png (56×759 px, 8 KB)

enwiki
image.png (48×676 px, 6 KB)

Hi @sdkim,
I have updated the description. It looks like this happens when the request is made with the Accept-Language header; please see the screenshots in the description.

Thanks @cooltey. Since the logic behind this is in RESTBase, I've tagged Platform Engineering.

We don't store language variant transformations in RESTBase; they are done on the fly. Given that /page/related needs to compute summaries for many articles in variants, it's no wonder it can be slow.

Fixing it in RESTBase would be a VERY significant investment and generally goes against the idea of sunsetting RESTBase. If this is a very big deal, we need to invest time in moving this endpoint into the Core REST API and adding proper caching there.

RESTBase just seems to proxy to the Action API with no caching. It uses generator=search with a "morelike" operator in the search query.
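
For reference, an Action API query of the kind described can be sketched as below. The exact parameters RESTBase sends are not given in this thread, so the limit, namespace filter, and the helper name morelike_query_url are assumptions; generator=search and the "morelike:" search keyword are the parts taken from the comment.

```python
import urllib.parse


def morelike_query_url(host, title, limit=20):
    """Build an Action API URL using generator=search with a morelike query."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "search",
        # "morelike:" asks the search backend for pages similar to `title`.
        "gsrsearch": f"morelike:{title}",
        "gsrnamespace": 0,   # assumption: restrict to articles
        "gsrlimit": limit,   # assumption: RESTBase's real limit is not stated
    }
    return f"https://{host}/w/api.php?" + urllib.parse.urlencode(params)
```

For example, morelike_query_url("en.wikipedia.org", "Taiwan") produces a URL that can be opened in a browser to inspect the raw search results.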

Yeah. After that, it looks up the summaries for the articles stored in Cassandra, merges them in, and returns the result. When summaries come from storage it's all very fast, but for language variants the stored summaries need to be converted via Parsoid, so it becomes very slow.
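
The flow described above can be sketched roughly as follows, to show where the per-article cost comes from. This is an illustration only, not RESTBase code: related_with_summaries, the stored mapping, and the convert callable are hypothetical stand-ins for the Cassandra lookup and the Parsoid variant conversion.

```python
from typing import Callable, Dict, List, Optional


def related_with_summaries(
    titles: List[str],
    stored: Dict[str, str],
    variant: Optional[str],
    convert: Callable[[str, str], str],
) -> List[dict]:
    """Merge stored summaries into related-page results.

    `titles` stands in for the search results, `stored` for summaries kept
    in storage, and `convert` for the on-the-fly variant conversion.
    """
    results = []
    for title in titles:
        summary = stored.get(title)
        if summary is None:
            continue  # no stored summary for this title in the sketch
        if variant is not None:
            # One conversion per related article: this is the slow path,
            # because converted variants are not stored.
            summary = convert(summary, variant)
        results.append({"title": title, "summary": summary})
    return results
```

With no variant requested, the function only reads from stored; with a variant, convert runs once for every related article, which matches the slowdown reported for Accept-Language: zh-hant.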

What is the user impact? Is this endpoint used by something?

It is used by the Wikipedia Android and iOS apps in two places.
One is the "Because you read" card on the Explore feed, and the other is the footer of the mobile-html output, which shows related pages in the "Read more" section.