Page MenuHomePhabricator

Bust intermediate caching when Set-Cookie headers are part of the response
Closed, ResolvedPublic

Description

The Wikimedia production cluster uses custom configuration in its Varnish caching layer (rOPUPb73afc51fedb: Make sure Set-Cookie responses are not cacheable, and log violations) to to protect against inadvertently caching responses from MediaWiki which carry Set-Cookie headers. This is a protection against an intermediate cache accidentally delivering a cookie header that may be user specific to multiple clients.

MediaWiki itself should attempt to provide such protection for the benefit of all deployments.

Event Timeline

Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptFeb 24 2016, 5:19 PM
bd808 triaged this task as Normal priority.
Restricted Application added a project: User-bd808. · View Herald TranscriptFeb 24 2016, 5:19 PM

Change 272937 had a related patch set uploaded (by BryanDavis):
Guard against allowing caching when cookies are present

https://gerrit.wikimedia.org/r/272937

bd808 added a subscriber: Tgr.Feb 24 2016, 5:24 PM

Copying discussion from gerrit:

Patch Set 2:
Not sure it is a good idea to this for any cookie. What if an endpoint just sets some cookie deterministically? (E.g. I think MobileFrontend has some kind of special page to handle the "opt out from mobile" link.) Maybe only warn if something in OutputPage::getVaryCookies() gets set?
Also, not sure if there is anything wrong with setting cookies on the third branch ("We do want clients to cache if they can...")

Copying discussion from gerrit:

Patch Set 2:

Not sure it is a good idea to this for any cookie

The intent here is to mirror the WMF production cookie cache avoidance Varnish configuration into MediaWiki's code [0] for the benefit of site operators who aren't behind the WMF cache system. I really don't think there is such a thing as a "deterministic" cookie. There may be GET/POST code paths that always send a Set-Cookie header but the act of doing so is a state changing operation for the client.
Choosing to flip to the no-cache headers may not be the best thing we can do however. Fundamentally we just want to tell any intermediate cache to ignore the response. There is nothing wrong with the actual user-agent storing Set-Cookie responses in local cache as far as I know.
[0]: https://phabricator.wikimedia.org/rOPUPb73afc51fedbe1119f4af35c7d9ab331f78357db

@Tgr may be correct that what we really want to do for a response that is marked as cachable (OutputPage->mEnableClientCache === true) which also has Set-Cookie headers is emit the "private caching" headers that are currently sent when OutputPage::haveCacheVaryCookies() or OutputPage::isPrintable() are true. Currently the cache headers added by that code path are:

Expires: Thu, 01 Jan 1970 00:00:00
Cache-Control: private, must-revalidate, max-age=0

I think that this behavior would would match the logic of the failsafe guard condition that is in the VCL config ((beresp.ttl > 0s && beresp.http.Set-Cookie)) based on my reading of the beresp.ttl documentation. The max-age=0 stanza of the Cache-Control header should cause Varnish to set bresp.ttl = 0 if I am understanding the documentation correctly. We may want to amend the existing Cache-Control header to also set s-maxage=0.

I'd like to hear thoughts from @BBlack and others on that.

What @bd808 says directly above all makes sense. Note that given the presence of max-age=0, adding s-maxage=0 should be redundant according to the standards. It may be that we duplicate that in some current CC header emissions just for bug-workaround reasons outside of our infra, though.

https://gerrit.wikimedia.org/r/272937 updated to take the existing branch of emitting headers that let the client cache but not (properly functioning) intermediate caches. When cookies are present, the headers for the response should include:

Expires: Thu, 01 Jan 1970 00:00:00
Cache-Control: private, must-revalidate, max-age=0

Change 272937 merged by jenkins-bot:
Guard against allowing intermediate caching when cookies are present

https://gerrit.wikimedia.org/r/272937

Change 275857 had a related patch set uploaded (by Anomie):
Use header_register_callback to avoid caching responses with Set-Cookie headers

https://gerrit.wikimedia.org/r/275857

Change 275857 merged by jenkins-bot:
Use header_register_callback to avoid caching responses with Set-Cookie headers

https://gerrit.wikimedia.org/r/275857

bd808 closed this task as Resolved.Mar 10 2016, 4:33 PM
bd808 moved this task from Needs Review/Feedback to Done on the User-bd808 board.
bd808 moved this task from Done to Archive on the User-bd808 board.Mar 14 2016, 5:55 PM