RESTbase cached morelike endpoint
Closed, Resolved · Public

Description

As a reader, I want Related Articles / Read More to load quickly, so that I can more easily find other neat stuff at a glance.

Acceptance criteria

  • Card data in Related Articles / Read More is obtained in 250ms or less at production web scale from a RESTbase endpoint.
  • URL format is predictable such that web, Android, and iOS can all get consistent responses.

Note: the discussion and ensuing work in T124225 ("PageImages should never return non-free images") may introduce an extra image field (e.g., "freely_licensed_fallback") in the response, though it is conceivable that no such additional field will be needed.

Event Timeline

dr0ptp4kt created this task. · Feb 5 2016, 3:10 PM
dr0ptp4kt raised the priority of this task to Needs Triage.
dr0ptp4kt updated the task description.
dr0ptp4kt added subscribers: dr0ptp4kt, Dbrant, EBernhardson.
Restricted Application added subscribers: StudiesWorld, Aklapper. · Feb 5 2016, 3:10 PM
dr0ptp4kt renamed this task from Cache Related Articles API hits to Edge Side Cache Related Articles API Responses. · Feb 5 2016, 3:11 PM
dr0ptp4kt set Security to None.
dr0ptp4kt renamed this task from Edge Side Cache Related Articles API Responses to Edge Side Cache Related Articles Card Data Responses. · Feb 5 2016, 3:41 PM
dr0ptp4kt updated the task description.
dr0ptp4kt added subscribers: Nirzar, GWicke, Niedzielski and 8 others.
dr0ptp4kt triaged this task as Normal priority. · Feb 8 2016, 2:24 PM

We met in person and agreed to the following:

  • The web will use smaxage as a bridge between now and the future when a RESTbase endpoint is available. @dr0ptp4kt to file a new task for this.
  • The apps teams are interested in migrating to a RESTbase endpoint once it is available.
  • The Reading teams of web, Android, and iOS will provide the URLs they use for construction of Read More (for the sake of completeness, iOS should also provide what it uses for the 20-row version in the forthcoming Wikipedia Mobile 5 release). These URLs should speak to pixel density variance if it has any bearing on the URL format for the given platform.
  • The parameters will be merged and normalized such that one consistent response can be obtained that will be capable of serving these major channels.
  • @dr0ptp4kt to check on TTL. DONE: @EBernhardson confirmed it is presently 0, but will become 24 hours in a forthcoming deployment.

@bearND & @Mholloway (CC @Niedzielski and @Dbrant), @BGerstle-WMF would you please provide your URLs in a comment here? The web URL is of the following form:

https://en.wikipedia.org/w/api.php?action=query
&format=json
&formatversion=2
&prop=pageimages%7Cpageterms
&piprop=thumbnail
&pithumbsize=80
&wbptterms=description
&pilimit=3
&generator=search
&gsrsearch=morelike:Telopea_truncata
&gsrnamespace=0
&gsrlimit=3
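
For reference, the web URL above can be assembled from its parameters. The following is only a sketch: the helper name `buildMorelikeUrl` and its `limit`, `thumbSize`, and `smaxage` options are hypothetical, and the 86400-second value in the usage line is an assumption chosen for illustration, not something agreed in this task. `smaxage` is the standard MediaWiki API parameter for the shared-cache lifetime.

```javascript
// Hypothetical helper: rebuilds the web "morelike" query shown above.
// Parameter names and defaults are copied from that URL; the optional
// smaxage value is an assumption supplied by the caller.
function buildMorelikeUrl(title, options) {
  const opts = Object.assign({ limit: 3, thumbSize: 80 }, options);
  const params = new URLSearchParams({
    action: 'query',
    format: 'json',
    formatversion: '2',
    prop: 'pageimages|pageterms',
    piprop: 'thumbnail',
    pithumbsize: String(opts.thumbSize),
    wbptterms: 'description',
    pilimit: String(opts.limit),
    generator: 'search',
    gsrsearch: 'morelike:' + title,
    gsrnamespace: '0',
    gsrlimit: String(opts.limit)
  });
  // Only ask for edge caching when the caller passes a lifetime in seconds.
  if (opts.smaxage !== undefined) {
    params.set('smaxage', String(opts.smaxage));
  }
  return 'https://en.wikipedia.org/w/api.php?' + params.toString();
}

// Usage: the same query as above, plus an assumed 24-hour shared-cache lifetime.
console.log(buildMorelikeUrl('Telopea_truncata', { smaxage: 86400 }));
```

Note that `URLSearchParams` percent-encodes the `|` and `:` characters, so the output contains `pageimages%7Cpageterms` and `morelike%3ATelopea_truncata`.
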
dr0ptp4kt renamed this task from Edge Side Cache Related Articles Card Data Responses to RESTbase cached morelike endpoint. · Feb 10 2016, 2:24 PM
dr0ptp4kt updated the task description.
dr0ptp4kt updated the task description. · Feb 10 2016, 2:43 PM
Mholloway added a comment. · Edited · Feb 10 2016, 2:51 PM

ANDROID:

https://en.wikipedia.org/w/api.php?action=query
&format=json
&prop=pageterms%7Cpageimages%7Cpageprops
&ppprop=mainpage%7Cdisambiguation
&wbptterms=description
&generator=search
&gsrsearch=morelike%3AFourth_generation_of_video_game_consoles
&gsrnamespace=0
&gsrwhat=text
&gsrinfo=
&gsrprop=redirecttitle
&gsrlimit=5
&piprop=thumbnail
&pithumbsize=640
&pilimit=5
&continue=

Note that we actually request 5 suggestions, and then display up to three after filtering out the Main Page, disambiguation pages, the current page, and pages with no thumbnail.

https://git.wikimedia.org/blob/apps%2Fandroid%2Fwikipedia.git/master/app%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikipedia%2Fpage%2FSuggestionsTask.java

Edited to add: pithumbsize is the statically defined lead image width (which depends on the device configuration) divided by the device's display density.
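
The client-side filtering described in this comment (request five suggestions, show up to three) could be sketched roughly as follows. This is not the Android implementation, which lives in the Java file linked above. The function name `pickSuggestions` is hypothetical, and the sketch assumes that `pages` is an array of page objects from the API response, that `pageprops` carries `mainpage` or `disambiguation` keys when they apply, and that titles are compared in the same normalized form.

```javascript
// Hypothetical sketch of the filtering step: drop the Main Page,
// disambiguation pages, the current page, and pages without a thumbnail,
// then keep at most maxShown results (3 by default).
function pickSuggestions(pages, currentTitle, maxShown) {
  const limit = maxShown === undefined ? 3 : maxShown;
  return pages.filter(function (page) {
    const props = page.pageprops || {};
    const isMainPage = 'mainpage' in props;
    const isDisambiguation = 'disambiguation' in props;
    const isCurrent = page.title === currentTitle;
    const hasThumb = Boolean(page.thumbnail && page.thumbnail.source);
    return !isMainPage && !isDisambiguation && !isCurrent && hasThumb;
  }).slice(0, limit);
}
```

Because this query does not set formatversion=2, `query.pages` in the response is an object keyed by page ID, so a caller would pass something like `Object.values(response.query.pages)`.
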

GWicke raised the priority of this task from Normal to High. · Feb 29 2016, 11:20 PM
GWicke added a comment. · Edited · Mar 3 2016, 1:57 AM

@Pchelolo has created a PR for this at https://github.com/wikimedia/restbase/pull/533. Please have a look.

Notes:

  • Thumbs are 640px wide by default. This is the old issue from T66214 again. For now, clients will need to modify the URL if a different size is desired.
  • 5 results are returned by default.

@BGerstle-WMF @Fjalapeno @Mhurd would you please confirm the URL format for related pages calls (the 3 article version, not the 20 one) for iOS?

If we need a small tweak on @Pchelolo's patch, now is an excellent time to go for it.

If there are no objections, we will merge and deploy this next Monday, 2016-03-07.

mobrovac closed this task as Resolved. · Mar 7 2016, 7:04 PM
mobrovac assigned this task to Pchelolo.

This has been merged and deployed, so resolving. Re-open it in case refinements are needed.

bearND added a comment. · Mar 9 2016, 5:53 PM

@Pchelolo and @mobrovac:
Sorry for the late reply. Could we make this 320px instead of 640px? See also the corresponding Android app patch at https://gerrit.wikimedia.org/r/276214

Change 276254 had a related patch set uploaded (by Ppchelko):
Enable varnish caching for related pages.

https://gerrit.wikimedia.org/r/276254

Code like this should* be able to rewrite the size:

var newSrc = src.replace(/\/\d+(px-[^\/]+)$/, '/640$1');

*: Assuming that image titles in src attributes cannot contain slashes, which I am not sure about. Current code blocks new uploads of files with slashes in the name, but it is not clear whether that has always been the case. Alternative approaches might be to split on slash & use a defined prefix length to identify the last part of the hash, but this is a bit brittle as well.
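
The one-liner above can be wrapped so that the target width is a parameter, for instance to get the 320px size requested earlier. This is a sketch under the same no-slashes assumption. The function name `setThumbWidth` is hypothetical, and the URL in the usage line is an illustrative thumbnail path, not one taken from this task.

```javascript
// Hypothetical wrapper around the regex from the comment above.
// It replaces the numeric width in the final "<width>px-<name>" path segment
// and returns the input unchanged when that pattern is not found.
function setThumbWidth(src, width) {
  return src.replace(/\/\d+(px-[^\/]+)$/, '/' + width + '$1');
}

// Usage with an illustrative thumbnail URL:
var resized = setThumbWidth(
  'https://upload.wikimedia.org/wikipedia/commons/thumb/a/ab/Example.jpg/640px-Example.jpg',
  320
);
console.log(resized);
```
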

It seems desktop is pushing towards prioritising less-popular pages, cf. T128822: Add method for more like api query to not boost pages by popularity

Change 276254 merged by BBlack:
Enable varnish caching for related pages.

https://gerrit.wikimedia.org/r/276254