Separate Cache-Control header for proxy and client
Open, Medium, Public

Description

MediaWiki has traditionally used the Cache-Control header to control the CDN (i.e. the Squid reverse proxy), while the Cache-Control header for clients has been specified in the Squid configuration. Specifically, when a URL matches a certain regex, Squid strips the Cache-Control header and replaces it with the configured one.

This is not ideal, as noted by Gabriel in a comment in the original code. It would be better if MediaWiki specified both headers in its response, so that the URL regex and the client Cache-Control header do not need to be maintained in the CDN configuration. Originally, this would have required a Squid patch, but now that we are switching to Varnish, the feature can be implemented in VCL.

Specifically, MW should send a Client-Cache-Control header which Varnish will rewrite to Cache-Control as appropriate.

Details

Reference
bz48835

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:32 AM
bzimport set Reference to bz48835.
bzimport added a subscriber: Unknown Object (MLST).

We could do this the other way around and partially implement the semi-standard (semi because it's from the W3C, not the IETF) Surrogate-Control header, leaving Cache-Control intact for end-users. Fastly, for example, seems to recommend it to its users, so this may be an alternative that is more compatible with the real world.

(In reply to comment #1)

Varnish uses the Cache-Control header in RFC2616_Ttl(), so I suppose it would be necessary to move the Cache-Control header out to some temporary pseudo-header in vcl_fetch, and to move it back into place in vcl_deliver. While the object is in the cache, Surrogate-Control would be copied into Cache-Control.

Support for pass mode would theoretically be simpler with Surrogate-Control.

Either way, there would have to be some backwards compatible handling in Varnish, to account for the progressive rollout of the new MW code. If Client-Cache-Control/Surrogate-Control is missing, Varnish would have to interpret Cache-Control in the old way.
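A rough sketch of that fallback, assuming Varnish 3 VCL as used elsewhere in this discussion. The marker header X-Legacy-CC, the URL regex and the replacement Cache-Control value are all placeholders, not taken from our actual configuration:

```vcl
# Sketch only. X-Legacy-CC is a hypothetical internal marker header.
sub vcl_fetch {
    if (!beresp.http.Surrogate-Control && !beresp.http.Client-Cache-Control) {
        # Old MW code: Cache-Control is addressed to the CDN, so the TTL
        # Varnish derived from it is already what we want. Just remember
        # that the client-facing header still has to be replaced later.
        set beresp.http.X-Legacy-CC = "1";
    }
}

sub vcl_deliver {
    if (resp.http.X-Legacy-CC) {
        unset resp.http.X-Legacy-CC;
        # Placeholder regex and value, standing in for whatever the
        # CDN configuration maintains today.
        if (req.url ~ "^/wiki/") {
            set resp.http.Cache-Control = "private, s-maxage=0, max-age=0, must-revalidate";
        }
    }
}
```

Because the marker is stored with the cached object, the old-style rewrite also applies on cache hits, and it disappears on its own once every backend sends the new header.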

On the MW side, OutputPage could provide an interface allowing configuration of the mapping of headers:

a) Old $wgUseSquid = false:

  • Client-Cache-Control -> Cache-Control
  • Surrogate-Control -> deleted

b) Old $wgUseSquid = true:

  • Surrogate-Control -> Cache-Control
  • Client-Cache-Control -> deleted

c) Surrogate-Control scheme:

  • Surrogate-Control -> Surrogate-Control
  • Client-Cache-Control -> Cache-Control

d) Client-Cache-Control scheme:

  • Surrogate-Control -> Cache-Control
  • Client-Cache-Control -> Client-Cache-Control

We currently don't have a Client-Cache-Control header at all and I don't think we should introduce it now with that name. Introducing just Surrogate-Control and doing the VCL ping-pong you mentioned sounds more sensible to me. We'd need a temporary header to store the client cache control, so we may end up using Client-Cache-Control inside VCL as an interim header, but I don't see a reason for MediaWiki to use it. As I see it, the VCL could just be:

sub vcl_fetch {
  if (beresp.http.Surrogate-Control) {
    # Park the client-facing header, then expose Surrogate-Control
    # to the cache as if it were Cache-Control.
    set beresp.http.Client-Cache-Control = beresp.http.Cache-Control;
    set beresp.http.Cache-Control = beresp.http.Surrogate-Control;
    unset beresp.http.Surrogate-Control;
  }
}

sub vcl_deliver {
  if (resp.http.Client-Cache-Control) {
    # Restore the client-facing header before the response leaves the cache.
    set resp.http.Cache-Control = resp.http.Client-Cache-Control;
    unset resp.http.Client-Cache-Control;
  }
}

I don't see any handling that we do now to preserve the backwards compatibility you mentioned. Even if we do and I missed it, we can easily implement it as "else" clauses above, no?

It's a pity that Varnish doesn't natively support Surrogate-Control, indeed. Ironically, Squid 3 does in some form :) (so using it inside MediaWiki may be generally useful). I guess we could provide patches to Varnish for the long term, but VCL hacks seem viable in the short term.

Note that the standard specifies a Surrogate-Capability request header to signal the capability to handle Surrogate-Control. We could set it in Varnish and MediaWiki could check for it, so we could avoid a configuration option.
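For illustration, advertising the capability from Varnish could look roughly like this (Varnish 3 syntax). The edge-arch spec names the request header Surrogate-Capability and gives its value the form token="Surrogate/1.0"; the device token "varnish" below is just an assumed example:

```vcl
# Sketch only. "varnish" is a hypothetical device token.
sub vcl_recv {
    # Overwrite whatever the client sent, so that only our own caches
    # can claim surrogate capabilities towards the backend.
    # The {"..."} long-string form allows the embedded double quotes.
    set req.http.Surrogate-Capability = {"varnish="Surrogate/1.0""};
}
```

MediaWiki would then only emit Surrogate-Control when it sees this request header, and fall back to plain Cache-Control otherwise.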

Also note that the same Surrogate-Capability/Surrogate-Control mechanism could also be used to signal ESI back and forth (this is defined in the spec). Yuri has used X-Force-ESI (request) and X-Enable-ESI (response) for this purpose in the mobile caches for his ESI testing. We could deprecate those in favor of unified Surrogate handling in core, especially as we move in the direction of doing ESI.

Immediate applications:

  • Normal page views (vcl_deliver in text-frontend.inc.vcl.erb)
  • Mobile page views (vcl_deliver in mobile-frontend.inc.vcl.erb)

Also, the use of Cache-Control in vcl_fetch in wikimedia.vcl.erb and in vcl_fetch in text-backend.inc.vcl.erb would have to be updated. Some care would have to be taken to ensure that MW does not accidentally send a public Surrogate-Control on responses with private data, where CC:private is currently sent and assumed to be sufficient. Maybe CC:private should override Surrogate-Control.
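One way the "CC:private overrides Surrogate-Control" rule could be expressed, as a sketch in Varnish 3 VCL. The regex is a simplified match for the private directive, and the 120s hit-for-pass lifetime mirrors the built-in default rather than anything agreed on here:

```vcl
# Sketch only: a safety net against a public Surrogate-Control being
# sent by mistake on a response that carries private data.
sub vcl_fetch {
    if (beresp.http.Cache-Control ~ "(?i)(^|[ ,])private([ ,=]|$)") {
        # Ignore any surrogate instructions and do not cache the object.
        unset beresp.http.Surrogate-Control;
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
}
```

Cache-Control itself is left untouched, so the client still sees the private directive.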

Aaron suggests that the feature could be used to allow private caching of resources delivered to logged-in users.

Note that, contrary to what I implied in comment #2, Surrogate-Control does not have the same format as Cache-Control. In particular http://www.w3.org/TR/edge-arch specifies the use of the no-store token and does not recognise no-cache or private.

bd808 edited projects, added Varnish; removed MediaWiki-Core-Team. Apr 8 2015, 11:42 PM
bd808 set Security to None.
Restricted Application added a subscriber: Aklapper. Oct 1 2015, 4:19 PM
faidon updated the task description. Oct 1 2015, 4:19 PM
faidon edited projects, added acl*sre-team, Traffic; removed Varnish.
Restricted Application added a subscriber: Matanya. Oct 1 2015, 4:19 PM
BBlack added a subscriber: BBlack. May 5 2016, 4:33 PM

Note also that both CC and SC carry grace-mode information.

In CC: our max-age is s-maxage with fallback to max-age, and our grace is stale-while-revalidate.
In SC: max-age's format is X[+Y], where X is our max-age, and Y is our optional grace window.
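As a sketch of how the SC form could be mapped onto Varnish's TTL and grace (Varnish 3 syntax, assuming the std vmod is available for the string-to-duration conversion). Varnish computes beresp.ttl from Cache-Control/Expires before vcl_fetch runs, so the values have to be set explicitly here:

```vcl
import std;

# Sketch only: map "Surrogate-Control: max-age=X+Y" to TTL X and grace Y.
sub vcl_fetch {
    if (beresp.http.Surrogate-Control ~ "(?i)max-age=[0-9]+") {
        # X: freshness lifetime in seconds.
        set beresp.ttl = std.duration(
            regsub(beresp.http.Surrogate-Control,
                   "(?i)^.*max-age=([0-9]+).*$", "\1") + "s",
            0s);
        if (beresp.http.Surrogate-Control ~ "(?i)max-age=[0-9]+\+[0-9]+") {
            # Y: optional staleness (grace) window in seconds.
            set beresp.grace = std.duration(
                regsub(beresp.http.Surrogate-Control,
                       "(?i)^.*max-age=[0-9]+\+([0-9]+).*$", "\1") + "s",
                0s);
        }
    }
}
```

For example, a response with Surrogate-Control: max-age=3600+600 would get a TTL of one hour and a grace window of ten minutes, independent of what Cache-Control says to clients.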

The plan that's starting to form in my head here, interrelated with T124954 (where we have issues with being able to consistently cap the overall life of objects as they traverse cache layers), is something like this:

  1. Start by blocking out the SC headers at our front edge: do not accept Surrogate-Capability from the outside, and do not forward any Surrogate-Control to the outside world.
  2. At least initially, block it from the applayer in the same way on the backend fetch side, so we can deal with inter-cache issues first.
  3. On reception of a response from an application backend, process Cache-Control/Expires (happens in Varnish already today, but we can also re-process...) to determine the effective max-age. Forward this on to upper caches as Surrogate-Control: max-age, after having applied our TTL caps to it, leaving CC unmolested.
  4. On inter-cache response reception, ignore CC and replace the TTL that Varnish set (from CC/Expires) with the Surrogate-Control max-age.
  5. Eventually expand on the initial SC-creation in step (3), so that we have a compatibility mode where we translate from CC or accept applayer SC (but again, we apply our own policy restrictions for TTL capping, grace, etc). Then tell app devs (MW, RB, etc) how they should start using SC to replace CC for Varnish-control purposes, and how CC should be handled in terms of controlling the outside world (e.g. what does s-maxage really mean given HTTPS and given it's not controlling our own caches?).
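Step 1 and the TTL-capping part of step 3 might start out as something like the following (Varnish 3 syntax). The 1d cap is an arbitrary placeholder, and the first two subroutines are assumed to live only in the VCL of the frontend-most cache. The request header is spelled Surrogate-Capability, as in the edge-arch spec:

```vcl
# Sketch only. Frontend edge: keep the surrogate headers internal.
sub vcl_recv {
    # Do not accept capability claims from the outside world.
    unset req.http.Surrogate-Capability;
}

sub vcl_deliver {
    # Do not leak our inter-cache instructions to clients.
    unset resp.http.Surrogate-Control;
}

# Backend-most cache: cap the TTL Varnish derived from CC/Expires.
sub vcl_fetch {
    if (beresp.ttl > 1d) {
        set beresp.ttl = 1d;
    }
}
```

Emitting the capped value as Surrogate-Control: max-age towards the upper caches is left out here, since it depends on how the remaining lifetime is tracked across layers (see T124954).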
ema added a subscriber: ema. May 27 2016, 2:52 PM
Gilles added a subscriber: Gilles. Jun 8 2016, 12:58 PM
ema moved this task from Triage to Caching on the Traffic board. Sep 30 2016, 2:33 PM