
Enable caching for the Mobile Content Service's RESTBase public endpoints
Closed, Resolved · Public

Description

We'd like to enable caching for the subset of RESTBase's endpoints pertaining to the Mobile-Content-Service on the text Varnishes. Specifically, the following URIs should be cached:

  • https://{domain}/api/rest_v1/page/mobile-html/{title}
  • https://{domain}/api/rest_v1/page/mobile-sections/{title}
  • https://{domain}/api/rest_v1/page/mobile-sections-lead/{title}
  • https://{domain}/api/rest_v1/page/mobile-sections-remaining/{title}
  • https://{domain}/api/rest_v1/page/mobile-text/{title}

PR 317 introduces the HTCP purging logic into RESTBase, where all of Mobile-Content-Service's endpoints are purged from the cache as soon as an update job for a specific title is received from the job runners.
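As a rough illustration of what "purging all Mobile-Content-Service endpoints for a title" amounts to, the sketch below expands one title into the five URIs listed above. It is not the code from PR 317: the function name `mcs_purge_urls` and the title-encoding rule are assumptions made for this example, and the actual sending of HTCP purge packets is not shown.

```python
from urllib.parse import quote

# Endpoint names taken from the URI list in the task description.
MCS_ENDPOINTS = (
    "mobile-html",
    "mobile-sections",
    "mobile-sections-lead",
    "mobile-sections-remaining",
    "mobile-text",
)


def mcs_purge_urls(domain: str, title: str) -> list[str]:
    """Return every Mobile Content Service URL to purge for one title.

    Hypothetical helper: the name and signature are not from RESTBase.
    """
    # Assumption: the title is percent-encoded as a single path segment.
    encoded = quote(title, safe="")
    return [
        f"https://{domain}/api/rest_v1/page/{endpoint}/{encoded}"
        for endpoint in MCS_ENDPOINTS
    ]
```

For example, `mcs_purge_urls("en.wikipedia.org", "Foo")` yields five URLs, one per endpoint, each of which would then be handed to whatever emits the purge.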

Note: the purging logic has been tested in the Beta-Cluster-Infrastructure (cf. T113235: Test HTCP purging in labs) and verified to work.

Event Timeline

mobrovac raised the priority of this task to High.
mobrovac updated the task description.
mobrovac added subscribers: mobrovac, BBlack, GWicke and 3 others.
Restricted Application added subscribers: Matanya, Aklapper. · Sep 24 2015, 12:04 PM

Currently, the VCL in the text and mobile clusters (identical in this regard) isn't messing with the Cache-Control headers set on RESTBase's responses. In general, the right way to do this is to change the Cache-Control headers in RB to allow caching, rather than messing with particular URL regexes in VCL. However, a couple of points need exploring or fixing:

  1. Currently, the RB-related VCL implicitly assumes that RB responses are all no-cache, and thus does things like using backend_random instead of chashing, and setting req.hash_ignore_busy to avoid coalescing on cache updates. We'd want to switch back to the chashed inter-cache backend defs and remove hash_ignore_busy if we think RB is going to offer cacheable objects in general.
  2. For wiki content (the standard /wiki/Foo content pages), the general way we handle caching and purging is this: the app sends a Cache-Control header with s-maxage specifying the lifetime in our Varnish caches (which we cap at 30d within Varnish regardless of the CC header value), and then some custom logic in Varnish regex-matches those article content URLs and sets the publicly visible CC header to no-cache values, so that our PURGEs aren't thwarted by client-side or other intermediary caching outside the reach of our PURGE traffic. If we stick with this pattern, we'd probably have to do something similar here. I'd rather not keep extending it, though, as it adds needless complexity and VCL interdependency with the applayer. A better approach would be to define some custom response headers for these cases. For example, applayer code (both MW and RB) could send X-WMF-NoClientCache: 1 or something like that, which tells Varnish to obey the standard Cache-Control for its own purposes but reset Cache-Control to no-cache values for external clients.
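The header handling proposed in point 2 can be sketched outside of VCL. The Python model below is only an illustration under stated assumptions: `process_response` is a hypothetical name, the exact no-cache value is not given in this thread and is assumed here, header names are matched case-sensitively for brevity, and X-WMF-NoClientCache itself is a proposal rather than an existing header.

```python
import re

MAX_TTL = 30 * 24 * 3600  # the 30-day cap mentioned above, in seconds

# Assumption: the thread only says "no-cache values"; this string is illustrative.
NO_CLIENT_CACHE = "private, s-maxage=0, max-age=0, must-revalidate"


def process_response(headers: dict[str, str]) -> tuple[int, dict[str, str]]:
    """Return (ttl_for_the_cache_in_seconds, headers_sent_to_the_client)."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"\bs-maxage=(\d+)", cache_control)
    # The cache keeps honouring s-maxage for its own TTL, capped at 30 days.
    ttl = min(int(match.group(1)), MAX_TTL) if match else 0

    # The internal hint is consumed here and never forwarded to clients.
    client_headers = {
        name: value
        for name, value in headers.items()
        if name != "X-WMF-NoClientCache"
    }
    if headers.get("X-WMF-NoClientCache") == "1":
        # External clients see no-cache values so that PURGEs stay effective.
        client_headers["Cache-Control"] = NO_CLIENT_CACHE
    return ttl, client_headers
```

The point of this design is that the cache layer no longer needs URL regexes to decide which responses get the client-facing override; the application opts in per response.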
GWicke added a comment. Edited · Oct 1 2015, 3:27 PM

Currently, the RB-related VCL implicitly assumes that RB responses are all no-cache, and thus does things like using backend_random instead of chashing, and setting req.hash_ignore_busy to avoid coalescing on cache updates. We'd want to switch back to the chashed inter-cache backend defs and remove hash_ignore_busy if we think RB is going to offer cacheable objects in general.

Originally, the intention was to cache in the front-end caches only. I agree though that using the backends for caching could improve performance in Amsterdam, as currently RESTBase is only set up in eqiad and codfw. We have plans to look into having RESTBase instances on the edge by the end of the fiscal year, but we can always re-evaluate then. Typically, cacheable items will be backed by storage in RESTBase, so the penalty of a front-end cache miss is relatively low.

X-WMF-NoClientCache: 1

Another option might be to supply the public Cache-Control header as something like X-External-Cache-Control: ..., and then have Varnish assign that to Cache-Control if supplied. That way, apps can supply arbitrary Cache-Control headers to the client.
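A minimal sketch of this alternative, again as a Python model rather than VCL. The function name `apply_external_cache_control` is hypothetical, header names are matched case-sensitively for brevity, and X-External-Cache-Control is only the header name floated in this comment, not an implemented feature.

```python
def apply_external_cache_control(headers: dict[str, str]) -> dict[str, str]:
    """Return the headers a client would see after the proposed rewrite."""
    client_headers = dict(headers)
    # The internal header is always removed before the response leaves the cache.
    external = client_headers.pop("X-External-Cache-Control", None)
    if external is not None:
        # The app-supplied value replaces Cache-Control for external clients only.
        client_headers["Cache-Control"] = external
    return client_headers
```

Responses that do not carry the extra header pass through with their Cache-Control unchanged, so the rewrite would be strictly opt-in per response.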

A basic PR is now available at https://github.com/wikimedia/restbase/pull/511. This does not set up purging yet, and only enables short-term caching limited to one hour.

Purging and long-term caching are going to follow. This depends on T126571, and should see at least some progress in the coming weeks.

This was deployed yesterday afternoon SF time. Even with a short TTL of one hour and beta use only, peak request rates reaching RESTBase itself appear roughly halved.

This suggests a decent cache hit rate, which should further reduce time to first byte.

Pchelolo closed this task as Resolved.Mar 3 2016, 8:38 PM
Pchelolo claimed this task.

Mobile endpoints are now cached in Varnish and actively purged (see T109742).