Determine if hovercards should use RESTBase summary endpoint
Closed, ResolvedPublic

Description

Because Hovercards essentially display the content of the summary endpoint, they were identified as a good candidate for this evaluation.

The goals of this evaluation are:

  1. Determine if RESTBase scaling is sufficient for web traffic
  2. Compare statistics such as latency and response time vs using the MediaWiki API
  3. Determine if using RESTBase for APIs is a valid strategy for Desktop/Mobile web

To support this evaluation, it should be possible to configure Hovercards to use a REST endpoint as the source.
Add a config variable that, when true, uses the REST API endpoint and, when false, uses the MediaWiki API (as it does currently). The REST API endpoint does not need to fall back to the MediaWiki API when REST is not responding. Setting this config variable to true is out of scope for this task; it should default to the non-REST endpoints.
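
A minimal sketch of the switch described above, in Python for illustration only. The flag name `USE_REST_ENDPOINT` and the function `build_preview_url` are hypothetical (the task does not name them); the MediaWiki API parameters are copied from the benchmark request later in this task, and the REST path follows the page summary endpoint linked below. No fallback is implemented, in line with the scope above.

```python
from urllib.parse import quote, urlencode

# Hypothetical flag name; defaults to False so the MediaWiki API stays in use.
USE_REST_ENDPOINT = False


def build_preview_url(title: str, use_rest: bool = USE_REST_ENDPOINT,
                      host: str = "en.wikipedia.org") -> str:
    """Return the URL a hovercard would fetch for `title`."""
    if use_rest:
        # REST summary endpoint: /api/rest_v1/page/summary/{title}
        return f"https://{host}/api/rest_v1/page/summary/{quote(title, safe='')}"
    # MediaWiki API query, same parameters as the benchmark request.
    params = {
        "action": "query",
        "format": "json",
        "prop": "info|extracts|pageimages|revisions",
        "formatversion": 2,
        "redirects": "true",
        "exintro": "true",
        "exchars": 525,
        "explaintext": "true",
        "piprop": "thumbnail",
        "pithumbsize": 600,
        "rvprop": "timestamp",
        "titles": title,
        "smaxage": 300,
        "maxage": 300,
        "uselang": "content",
    }
    return f"https://{host}/w/api.php?{urlencode(params)}"


print(build_preview_url("Roan_antelope"))
print(build_preview_url("Roan_antelope", use_rest=True))
```

Keeping the choice behind a single boolean means the default behaviour is unchanged until the flag is explicitly enabled.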

Relevant links:
https://www.mediawiki.org/wiki/RESTBase
https://en.wikipedia.org/api/rest_v1/#!/Page_content/get_page_summary_title

Event Timeline

ovasileva triaged this task as Medium priority. Sep 21 2016, 4:49 PM
ovasileva moved this task from Incoming to Needs Prioritization on the Web-Team-Backlog board.
ovasileva raised the priority of this task from Medium to High. Oct 7 2016, 12:59 PM

Lazy loading references would also be a good testing ground for this - T146396
What's stopping us from trying one of these? (i.e., why is this card in the "needs analysis" column?)

@Jdlrobson not sure - @ovasileva do you need any information from services or anything else?

ovasileva renamed this task from A/B test the RESTBase summary endpoint for hover cards to Allow hovercards to use RESTBase summary endpoint. Oct 26 2016, 11:38 AM
ovasileva updated the task description.

Determine if RESTBase scaling is sufficient for web traffic

REST API responses are cached in Varnish with a very high hit ratio: only about 100 req/s reach RESTBase out of the ~4000 req/s that hit the REST API. Scaling shouldn't be a problem here, since most of the load will be handled by Varnish.
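
As a quick check of the hit ratio those figures imply (both inputs are the approximate numbers quoted above, not new measurements):

```python
total_rps = 4000    # approximate requests/s hitting the REST API
backend_rps = 100   # approximate requests/s that miss Varnish and reach RESTBase

hit_ratio = 1 - backend_rps / total_rps
print(f"Varnish hit ratio: {hit_ratio:.1%}")  # prints "Varnish hit ratio: 97.5%"
```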

Compare statistics such as latency and response time vs using the MediaWiki API

I've done some simple benchmarking from within the production cluster to compare the latency of a cache miss:

  • MW API:
ab -n 1000 -H 'Host: en.wikipedia.org' 'http://api.svc.eqiad.wmnet/w/api.php?action=query&format=json&prop=info%7Cextracts%7Cpageimages%7Crevisions&formatversion=2&redirects=true&exintro=true&exchars=525&explaintext=true&piprop=thumbnail&pithumbsize=600&rvprop=timestamp&titles=Roan_antelope&smaxage=300&maxage=300&uselang=content'

Concurrency Level:      1
Time taken for tests:   40.116 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      1788014 bytes
HTML transferred:       1160000 bytes
Requests per second:    24.93 [#/sec] (mean)
Time per request:       40.116 [ms] (mean)
Time per request:       40.116 [ms] (mean, across all concurrent requests)
Transfer rate:          43.53 [Kbytes/sec] received

  • RESTBase API:
ab -n 1000 http://restbase.svc.eqiad.wmnet:7231/en.wikipedia.org/v1/page/summary/Roan_antelope

Concurrency Level:      1
Time taken for tests:   8.594 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      1846000 bytes
HTML transferred:       836000 bytes
Requests per second:    116.35 [#/sec] (mean)
Time per request:       8.594 [ms] (mean)
Time per request:       8.594 [ms] (mean, across all concurrent requests)
Transfer rate:          209.75 [Kbytes/sec] received

So, switching from the MW API to RESTBase should improve cache-miss latency by a factor of roughly 4.7 (40.1 ms vs. 8.6 ms per request). The more important aspect, though, is the cache hit ratio: for the REST API it should be much higher than for the MW API, because the summary endpoint is cached long-term, actively purged, and shared with other use cases. Also, calls to the MW API might fragment the cache in the user's browser depending on the order of the query parameters. So switching to RESTBase should yield a significant latency win.
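
To illustrate the point about parameter order: assuming a cache that keys on the exact URL string (an assumption for this sketch, not something measured here), two MW API requests with identical parameters in a different order count as separate entries. The helper `normalize_query` is hypothetical and only shows what sorting the parameters would do; the REST URL sidesteps the issue because the title is part of the path.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query(url: str) -> str:
    """Return `url` with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Same parameters, different order (shortened from the benchmark request).
a = "https://en.wikipedia.org/w/api.php?action=query&format=json&titles=Roan_antelope"
b = "https://en.wikipedia.org/w/api.php?titles=Roan_antelope&format=json&action=query"

print(a == b)                                    # False: two distinct cache keys
print(normalize_query(a) == normalize_query(b))  # True: one key once sorted
```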

Thank you @Pchelolo for the awesome comparison. Also of note is the fact that cache misses for RB's summary endpoint are much less likely than the overall cache miss rate we have (100 reqs out of 4000), because the endpoint is automagically updated and used throughout the system.

ovasileva renamed this task from Allow hovercards to use RESTBase summary endpoint to Determine if hovercards should use RESTBase summary endpoint. Jan 18 2017, 1:07 AM
ovasileva claimed this task.

Hovercards will use the RESTBase summary endpoint. The change to RESTBase will be tracked in T123445: Add support for RESTBase endpoint consumption

Notes from today's meeting about using the RESTBase page summary endpoint in Page Previews can be found here: https://etherpad.wikimedia.org/p/popups-restbase/timeslider#1582