
Don't set cookies in traffic layer for non-user facing domains (avoid false third-party cookie warning)
Closed, Resolved (Public)

Description

The JavaScript console in Firefox shows error messages like "Request to access cookie or storage on “<URL>” was blocked because we are blocking all third-party storage access requests and content blocking is enabled."

This concerns:

Browsers cannot know that the Wikimedia, Wikipedia and sister domains are operated by the same foundation. Perhaps these resources could instead be served from paths under the wiki's own domain, for instance en.wikivoyage.org/intake-analytics/, en.wikivoyage.org/commons/, en.wikivoyage.org/wikidata/ and so on. I don't think creating such domain aliases would be a problem.
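To illustrate why the browser treats these requests as third-party, here is a minimal Python sketch. It assumes a deliberately naive notion of "site" (the last two labels of the hostname); real browsers use the Public Suffix List and their own content-blocking rules, so this is only an approximation. The function names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def naive_site(url: str) -> str:
    """Approximate the registrable domain as the last two host labels.

    Assumption: good enough for *.org hosts like the ones in this task,
    wrong for suffixes such as co.uk (browsers use the Public Suffix List).
    """
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def is_third_party(page_url: str, request_url: str) -> bool:
    """True when the request goes to a different site than the page."""
    return naive_site(page_url) != naive_site(request_url)


page = "https://en.wikivoyage.org/wiki/Main_Page"  # hypothetical page URL
for target in (
    "https://upload.wikimedia.org/",
    "https://intake-analytics.wikimedia.org/",
    "https://en.wikivoyage.org/intake-analytics/",
):
    print(target, is_third_party(page, target))
```

Under this approximation, only the last URL counts as first-party, which is the effect the proposed path-based aliases aim for.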

Event Timeline

Those URLs don't need to change. We just need to stop accidentally setting cookies on them. I'm 99% sure this is just coming from cookies like "Geo" and "LastVisit" being set on all traffic.

Several browsers already block these and that's working just fine. They are not intentionally set. At this point it's just noise seen by tech-savvy users looking at developer consoles, which would be good to fix to avoid confusion but otherwise is harmless.

ArielGlenn triaged this task as Medium priority. Sep 28 2020, 9:39 AM
Krinkle renamed this task from Blocking all third-party storage access requests to Don't set cookies in traffic layer for non-user facing domains (avoid false third-party cookie warning). Jan 26 2021, 5:02 PM
Krinkle updated the task description.
Krinkle updated the task description.
Krinkle removed a project: Performance-Team (Radar).
BBlack subscribed.

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all such tickets that haven't been updated in 6 months or more. This does not imply any human judgement about the validity or importance of the task, and is simply the first step in a larger task cleanup effort. Further manual triage and/or requests for updates will happen this month for all such tickets. For more detail, have a look at the extended explanation on the main page of Traffic-Icebox. Thank you!

I think this ticket could do with some more clarification: Which domains are affected? Should the Geo/LastVisit cookies be applied to a few domains explicitly rather than defaulting to every domain?

Is this related to "T255366: SameSite cookie issues"?

No, I don't think it is.

I think this ticket could do with some more clarification: Which domains are affected? Should the Geo/LastVisit cookies be applied to a few domains explicitly rather than defaulting to every domain?

Conceptually, we'd want to limit it to hosts that are wikis. In other words, requests to the domains that represent wiki projects (*.wikipedia.org, *.wiktionary.org, etc) and to the subdomains of .wikimedia.org that belong to one of several hundred special-case wikis. In particular, it should not apply to the subdomains previously identified as "cache_misc", i.e. intake-analytics.wikimedia.org, upload.wikimedia.org, etc.

Traditionally, this distinction was afaik effectively just all of "cache_text", already handled differently at the DNS level. But with cache_misc having been merged into cache_text I imagine this is slightly harder to do now without a lot of duplication.

The good news is that 99% of the non-wiki *.wikimedia.org hosts on which we're trying to avoid setting this cookie are never requested in a browser context from within a wiki page. That is, we don't make requests to phab, gerrit, grafana etc. from tier-1 wiki traffic, so it doesn't matter a whole lot there. In principle, though, it would be nice if VCL distinguished between wiki traffic and non-wiki traffic and, for example, did not add these cookies to responses of unrelated web applications.

One possible shortcut could be to just exclude the two domains we know are expected to be used in a wiki context. It's literally just the two in the task description:

  • upload.wikimedia.org (possibly easiest to do by making sure the cookie isn't applied to cache_upload, which afaik we still have a distinction for).
  • intake-analytics.wikimedia.org (could be explicitly excluded in VCL for cache_text frontend)
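The two options discussed above (a conceptual allowlist of wiki hosts versus the shortcut of excluding two known hosts) can be contrasted in a short Python sketch. This is not the production VCL: every name is hypothetical, and the domain lists are small illustrative samples rather than the real set of wiki projects or special-case wikis.

```python
# Illustrative only: all names and lists below are assumptions for this sketch.
WIKI_PROJECT_DOMAINS = ("wikipedia.org", "wiktionary.org")  # incomplete sample
SPECIAL_WIKIMEDIA_WIKIS = {"commons.wikimedia.org", "meta.wikimedia.org"}  # incomplete sample
KNOWN_NON_WIKI_HOSTS = {"upload.wikimedia.org", "intake-analytics.wikimedia.org"}


def _normalize(host: str) -> str:
    """Lowercase the host and drop surrounding whitespace and a trailing dot."""
    return host.strip().lower().rstrip(".")


def is_wiki_host(host: str) -> bool:
    """Conceptual approach: only hosts recognised as wikis get the cookie."""
    host = _normalize(host)
    if host in SPECIAL_WIKIMEDIA_WIKIS:
        return True
    # Matches subdomains such as en.wikipedia.org, not the bare project domain.
    return any(host.endswith("." + domain) for domain in WIKI_PROJECT_DOMAINS)


def shortcut_allows_cookie(host: str) -> bool:
    """Shortcut approach: every host gets the cookie except the two known ones."""
    return _normalize(host) not in KNOWN_NON_WIKI_HOSTS


for h in ("en.wikipedia.org", "upload.wikimedia.org", "phabricator.wikimedia.org"):
    print(h, is_wiki_host(h), shortcut_allows_cookie(h))
```

The difference shows up for hosts like phabricator.wikimedia.org: the shortcut would still set the cookie there, while the allowlist would not, at the cost of maintaining a list of wiki hosts.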

While searching for the relevant code in the Puppet repository, I noticed that code for this cookie exists at both the Varnish and ATS layers. Do we still need both? Was this previously being done by ats-tls or something?

Change 849184 had a related patch set uploaded (by BCornwall; author: BCornwall):

[operations/puppet@production] WIP: varnish: Conditionally set WMF-Last-Access cookie

https://gerrit.wikimedia.org/r/849184

Change 849184 merged by BCornwall:

[operations/puppet@production] varnish: Conditionally set WMF-Last-Access cookie

https://gerrit.wikimedia.org/r/849184

[~]$ curl -s -I https://en.wikipedia.org/ | grep Last-Access
set-cookie: WMF-Last-Access=03-Nov-2022;Path=/;HttpOnly;secure;Expires=Mon, 05 Dec 2022 12:00:00 GMT
set-cookie: WMF-Last-Access-Global=03-Nov-2022;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Mon, 05 Dec 2022 12:00:00 GMT
[~]$ curl -s -I https://doc.wikimedia.org/ | grep Last-Access
[~]$ curl -s -I https://upload.wikimedia.org/ | grep Last-Access
[~]$

\o/
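The same check can be scripted offline against captured headers. The following Python sketch is a hypothetical helper (not part of the patch or its tests) that extracts Last-Access cookie names from raw response header text; the sample input is copied from the curl output above.

```python
def last_access_cookie_names(raw_headers: str) -> list[str]:
    """Return names of WMF-Last-Access* cookies found in Set-Cookie lines."""
    names = []
    for line in raw_headers.splitlines():
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "set-cookie":
            continue
        # The cookie name is everything before the first "=" of the first attribute.
        name = value.split(";", 1)[0].split("=", 1)[0].strip()
        if name.startswith("WMF-Last-Access"):
            names.append(name)
    return names


# Headers as returned for en.wikipedia.org in the curl output above.
WIKI_RESPONSE = """\
set-cookie: WMF-Last-Access=03-Nov-2022;Path=/;HttpOnly;secure;Expires=Mon, 05 Dec 2022 12:00:00 GMT
set-cookie: WMF-Last-Access-Global=03-Nov-2022;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Mon, 05 Dec 2022 12:00:00 GMT
"""

print(last_access_cookie_names(WIKI_RESPONSE))
# An empty grep result, as seen for doc.wikimedia.org and upload.wikimedia.org.
print(last_access_cookie_names(""))
```

An empty list corresponds to the empty grep output seen for the non-wiki hosts.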

Gerrit kept reporting org.apache.http.client.protocol.ResponseProcessCookies : Invalid cookie header for WMF-Last-Access errors (T273605). The last one was on Nov 3rd at 17:03 UTC. Revisiting that task, I see it was first filed in 2015 as T98396.

Thank you for getting rid of the log spam!

Change 886840 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] varnish: Remove upload.wm.o test from text test

https://gerrit.wikimedia.org/r/886840

Change 886840 merged by BCornwall:

[operations/puppet@production] varnish: Remove upload.wm.o test from text test

https://gerrit.wikimedia.org/r/886840

Change 889846 had a related patch set uploaded (by BCornwall; author: BCornwall):

[operations/puppet@production] varnish: Check upload.wm.o for analytics cookies

https://gerrit.wikimedia.org/r/889846

Change 889846 abandoned by BCornwall:

[operations/puppet@production] varnish: Check upload.wm.o for analytics cookies

Reason:

varnish/files/tests/upload/10-frontend-deliver.vtc already validates that upload.wm.o isn't setting any kind of cookie: expect resp.http.Set-Cookie == <undef>

https://gerrit.wikimedia.org/r/889846