Evaluate the feasibility of cache invalidation for the action API
Open, Medium, Public

Description

The action API (api.php) supports caching for a user-defined period of time, but does not support cache invalidation. This is problematic for functionality that is under high load but might need to be updated immediately to reflect changes or to remove vandalism, personal data, etc. With the recent open-sourcing of hashninja, maybe that can change. A possible approach to adding partial cache invalidation could be:

  • install the Xkey (aka hashtwo/hashninja) Varnish module
  • every time the API is called, decide whether it is a purgeable request (to cut down the number of URLs that need to be purged). To be purgeable, the request must invoke a single module, be about a single target (title, revision, file, etc.), have its parameters in lexicographic order, have the contents of any list parameter in lexicographic order, and have no non-enum parameters apart from the object identifier (e.g. title). These criteria could perhaps be relaxed a bit, depending on hashninja performance and API usage stats.
  • on purgeable requests, have the API module output a set of Xkey HTTP headers, each carrying an invalidation tag such as id:<page_id>
  • have MediaWiki send an appropriate purge request (a normal HTTP request with an Xkey-Purge HTTP header containing a list of invalidation tags) on every content update (a rough sketch follows this list)
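
To make the flow concrete, here is a minimal sketch of the tagging and purging halves, assuming the header names and the id:<page_id> tag format described above; the cache endpoint URL and the helper names are purely illustrative, not an existing API.

```python
# Minimal sketch of the proposed scheme (assumptions: Xkey / Xkey-Purge header
# names and id:<page_id> tags as described in this task; endpoint is a placeholder).
import requests

CACHE_ENDPOINT = "https://varnish-frontend.example.org/"  # placeholder

def xkey_response_headers(page_id):
    """Headers a purgeable API response could emit so Varnish indexes it by tag."""
    return {
        "Xkey": "id:%d" % page_id,
        "Cache-Control": "s-maxage=3600",  # the user-defined caching period
    }

def purge_page(page_id):
    """On content update, ask the cache to drop every response tagged with this page."""
    return requests.request(
        "PURGE",
        CACHE_ENDPOINT,
        headers={"Xkey-Purge": "id:%d" % page_id},
        timeout=5,
    )
```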

Event Timeline

Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description.
Tgr added projects: MediaWiki-Action-API, Varnish.
Tgr subscribed.

@BBlack what do you think, is there a chance to deploy Xkey at Wikimedia? How could we measure or estimate its performance?

@bd808 any thoughts about API request stats? I assume the API usage stats are too well cleaned up to be of any help here, and we need to use the raw webrequests table - write a UDF for the "is purgeable?" logic, another one to extract the invalidation tag, then filter for purgeable GET API requests and count the number of requests per tag?
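
To sketch what those UDFs might look like: the following is purely illustrative, simplifies the purgeability criteria from the description, and the set of single-target parameters it checks is an assumption, not exhaustive.

```python
# Rough sketch of the "is purgeable?" check and tag extraction over request URLs,
# e.g. as the core of Hive UDFs run against the webrequests table.
# TARGET_PARAMS and the simplifications below are assumptions for illustration.
from urllib.parse import urlparse, parse_qs

TARGET_PARAMS = ("titles", "pageids", "revids", "title")

def is_purgeable(url):
    query = parse_qs(urlparse(url).query)
    params = [k for k in query if k not in ("action", "format")]
    if params != sorted(params):       # parameters must be in lexicographic order
        return False
    targets = [k for k in params if k in TARGET_PARAMS]
    # must be about a single target: one target parameter naming one object
    return len(targets) == 1 and "|" not in query[targets[0]][0]

def invalidation_tag(url):
    query = parse_qs(urlparse(url).query)
    for param in TARGET_PARAMS:
        if param in query:
            return "%s:%s" % (param, query[param][0])
    return None
```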

Are you looking for an estimate of how many requests would currently be purgeable under the scheme you describe? I think it should be possible to figure that out from the webrequests table, or actually even from the raw api.log data on fluorine.

I think the more important metric is how many cached requests would have to be deleted on a single purge.

I've just created the above tasks based on our existing goal for next quarter of getting Varnish 4 up and running here (this wasn't recorded in Phabricator yet, AFAIK), which is a prerequisite to getting XKey going. Note that we may also want this for Thumbor purging (T121391).

Keep in mind that even the Varnish 4 goal listed as a blocker here isn't enough to unblock this fully: the goal there is just to get it running at all for a single production cluster as a trial. There is still an additional step beyond that of moving all the clusters (critically in this case, the "text" cluster) to Varnish 4.

A little brainstorming on the API side of things:

Generically, an API query request could depend on a number of pages (a sketch of turning these dependencies into tags follows the list):

  • The pages listed in the 'titles', 'pageids', or 'revids' parameters, if any. This is up to 500 pages.
    • If automatic redirect resolution is used, then the targets of any redirects in there too.
  • If a generator is used, the pages output by the generator. This is up to 5000 pages.
    • If automatic redirect resolution is used, then the targets of any redirects in there too.
  • For 'list' modules, the page(s) specified in their parameters, if any. For example, the 'cmtitle' for list=categorymembers. Offhand, I think this is 0 or 1 per module.
  • For some 'list' modules and backlinks-style 'prop's, we might have to include the pages actually output too (up to 5000 pages per module). Others might already be purged by MediaWiki, e.g. when a template is edited all pages transcluding it get purged, or when a page is edited any categories that are added or removed get purged.
  • Some are probably just not practically cacheable; for example, list=recentchanges on an active wiki would need to be purged on every edit (unless rcstart/rcend specify "older than timestamp T", anyway).
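
As a rough illustration of how that dependency list might translate into tags for a single query request, here is a sketch; the attribute names on the request object are hypothetical, and the real collection would have to live inside the API modules themselves.

```python
# Illustrative only: one id:<page_id> tag per page the request depends on,
# following the dependency list above. The request attributes are hypothetical.
def collect_xkeys(request):
    page_ids = set()
    page_ids.update(request.listed_page_ids)      # titles/pageids/revids, up to 500
    page_ids.update(request.generator_page_ids)   # generator output, up to 5000
    page_ids.update(request.redirect_target_ids)  # targets of resolved redirects
    page_ids.update(request.list_param_page_ids)  # e.g. cmtitle for list=categorymembers
    return ["id:%d" % pid for pid in sorted(page_ids)]
```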

For non-query modules, we'd likely have to look at it on a case-by-case basis. Parse and expandtemplates, for example, could probably be XKeyed, while many others wouldn't work.

Some of Gergő's proposed limitations would cut down on the total number of XKeys per request as well as reduce the number of URLs subject to having XKeys at all, at the cost of limiting the number of API requests that are usable with this scheme. We'd probably want to indicate to the client somehow that the request was XKeyable, so developers don't have to guess.

Another alternative would be to make a "pageinfo" action oriented towards getting information about a single page, with less fine-grained options as to what exactly can be queried (e.g. the equivalent of prop=revisions wouldn't have an "rvprop" parameter). That may need some thought to avoid code duplication between query and "pageinfo", though.
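
Purely as an illustration of the contrast being suggested: the "pageinfo" action and its parameters below are hypothetical, while the query request uses real parameters.

```python
# Real, fine-grained query request: the rvprop knob changes what is returned,
# multiplying the URL space that would need to carry XKeys.
fine_grained = (
    "/w/api.php?action=query&format=json"
    "&prop=revisions&rvprop=content|timestamp&titles=Example"
)

# Hypothetical coarse "pageinfo" request: one page, one fixed output shape,
# so one easily taggable URL per page.
coarse = "/w/api.php?action=pageinfo&format=json&title=Example"
```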

chasemp triaged this task as Medium priority. May 5 2016, 8:44 PM

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all such tickets that haven't been updated in 6 months or more. This does not imply any human judgement about the validity or importance of the task, and is simply the first step in a larger task cleanup effort. Further manual triage and/or requests for updates will happen this month for all such tickets. For more detail, have a look at the extended explanation on the main page of Traffic-Icebox. Thank you!