
Remove "Referrer-Policy: no-referrer" from WMF WordPress sites
Open, Needs Triage, Public

Description

Background

The default Referrer Policy in all major browsers is strict-origin-when-cross-origin, which means only the origin (essentially the domain name) is included in the referral data when navigating between websites.

Historically, browsers defaulted to sharing the full URL in the Referer header. This has not been the case for several years now (since 2019, 2020, or 2021 depending on the browser).
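To make the difference between these policies concrete, here is a rough Python sketch of what a browser would send as the Referer under a few policy values. It is a simplification and not a full implementation of the Referrer Policy specification: origins are compared by scheme and host string only, and the function names and the example post URL are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical blog post URL, used for illustration only.
EXAMPLE_POST = "https://diff.wikimedia.org/2024/01/example-post/"


def origin_of(url: str) -> str:
    """Return the origin of a URL in the form browsers send it, e.g. https://host/."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def referer_for(policy: str, source_url: str, target_url: str) -> Optional[str]:
    """Return the Referer value sent for a navigation, or None if no header is sent.

    Simplified: default ports are not normalized and only four policies are handled.
    """
    src, dst = urlsplit(source_url), urlsplit(target_url)
    same_origin = (src.scheme, src.netloc) == (dst.scheme, dst.netloc)
    downgrade = src.scheme == "https" and dst.scheme != "https"
    # The fragment is never included in the Referer header.
    full_url = src._replace(fragment="").geturl()

    if policy == "no-referrer":
        return None
    if policy == "no-referrer-when-downgrade":  # the historical browser default
        return None if downgrade else full_url
    if policy == "origin-when-cross-origin":
        return full_url if same_origin else origin_of(source_url)
    if policy == "strict-origin-when-cross-origin":  # the present-day default
        if same_origin:
            return full_url
        return None if downgrade else origin_of(source_url)
    raise ValueError(f"policy not covered by this sketch: {policy}")
```

Under this sketch, a cross-site link from the example post yields only the origin with strict-origin-when-cross-origin, the full post URL with the historical default, and nothing at all with no-referrer.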

https://en.wikipedia.org/wiki/HTTP_referer
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Referrer-Policy

Problem

When people follow a link in a post on the Wikimedia Diff blog, no referral information is sent. The same problem affects the Wikimedia Techblog and wikimediafoundation.org. This means our blogs are absent from referral data on other websites, as well as from our own referral data, such as for Wikipedia and Commons pageviews.

I noticed this when researching T422584, through the suspicious absence of diff.wikimedia.org and such in our referral data. I traced this down to a set of 2023 changes to our WordPress configuration:

It seems the security scanner is unaware of the present-day default in browsers. To its credit, though, the report is neutral on what you set it to. It just wants you to be intentional. It actually points to a blog post where expert Scott Helme recommends various good choices. He does not recommend no-referrer.

When we apply the same security scanner to Wikipedia, the report makes a similar scary-red-text claim. It seems the scanner is also unaware of the meta tag <meta name="referrer" content="origin-when-cross-origin"> that MediaWiki already sets. As said, this is redundant in modern browsers. For Wikipedia, we set it redundantly to the default in order to protect readers in older browsers from leaking which articles are being read, because the URL of a Wikipedia article is in itself meaningful personal information.
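As an illustration of what a header-only scan would miss, here is a minimal Python sketch that looks for a referrer policy declared through a meta tag in the page HTML. The class and function names are hypothetical, and this is not how the scanner in question works internally.

```python
from html.parser import HTMLParser


class ReferrerMetaFinder(HTMLParser):
    """Collect the content of every <meta name="referrer"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.policies = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name == "referrer" and content:
            self.policies.append(content.strip().lower())


def meta_referrer_policies(html: str) -> list:
    """Return the referrer policies declared via meta tags, in document order."""
    finder = ReferrerMetaFinder()
    finder.feed(html)
    return finder.policies
```

Feeding it a page containing the tag quoted above returns a list with the single entry origin-when-cross-origin, whereas a check that only inspects response headers would see no policy at all.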

Proposal

Remove the override that currently sets the restrictive no-referrer via inc/http-headers.php in https://github.com/wikimedia/wikimedia-wordpress-security-plugin.

Keep the remaining instruction in incl/csp.php which matches the browser default strict-origin-when-cross-origin, the same way as we do on Wikipedia.

  • Update the security plugin.
  • Verify deployment on diff.wikimedia.org and wikimediafoundation.org.
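The verification step could be scripted along these lines. This is a rough Python sketch under stated assumptions: the function names are hypothetical, and the rule that the last recognized token wins when several values are present follows my reading of the Referrer Policy specification, so treat it as illustrative and not as a definitive check.

```python
import urllib.request

# Policy tokens defined by the Referrer Policy specification.
KNOWN_POLICIES = {
    "no-referrer",
    "no-referrer-when-downgrade",
    "same-origin",
    "origin",
    "strict-origin",
    "origin-when-cross-origin",
    "strict-origin-when-cross-origin",
    "unsafe-url",
}


def effective_policy(header_values):
    """Return the policy a browser would apply, or None if none is recognized.

    Accepts a list of Referrer-Policy header values. Each value may hold
    several comma-separated tokens; the last recognized token wins.
    """
    policy = None
    for value in header_values:
        for token in value.split(","):
            token = token.strip().lower()
            if token in KNOWN_POLICIES:
                policy = token
    return policy


def fetch_referrer_policy_values(url):
    """Fetch all Referrer-Policy header values for a URL (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get_all("Referrer-Policy") or []
```

After deployment, passing the result of fetch_referrer_policy_values for each site to effective_policy should no longer yield no-referrer. A result of None would mean no header policy is set and the browser default applies.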

Event Timeline

Hey @Krinkle do you think we need to have the Security folks take a look at this change? As you note we went through a decent amount of work to make Diff and the Foundation WordPress installs more secure. If I'm understanding you there's little risk here, but I want to be cautious.

I'm writing an essay on the topic, but in a nutshell: The unintended use of no-referrer and other mechanisms like it has fueled industry dominance of Google and others, making the broader ecosystem of websites like ours incorrectly believe that they are dependent on Google for 90% of their traffic. What has happened is that tonnes and tonnes of websites have blindly followed security theatre around Referrer-Policy, without understanding what this does or what choices you have. The consequence is that all of these websites have erased their own footprints, thus making it seem that all traffic comes from Google. It is a true tragedy of the commons, because whenever one site does this, they still see others, but others no longer see them, and over time, nobody sees anybody, except we all still see the handful of sites in our analytics that don't do this: big search engines and social media sites. There are other cases I'm researching where this is even worse.

For Referrer-Policy it's a bit surprising to see that no-referrer took off, because nobody actually recommended or argued for it. Not the free "security scanners", not MDN docs, not experts like Scott Helme. When you hire a consultant to secure your site, they are likely to start with the most paranoid settings possible. It's easier to turn things off than to have to take responsibility for leaving something on; even if turning it off kills the site in question, that's not the consultant's problem because they're primarily protecting themselves, not the site. The result is hundreds of thousands of websites that voluntarily blackholed their own traffic attribution without understanding what that means.

Unless someone understands the specific settings, the impact is not realized until a later time. That time is now. There is a cross-discipline gap between security, performance, and SEO. Sometimes these align, but often they don't. When they don't, it's important to understand how the overall thing works and make a decision that actually works.

> Hey @Krinkle do you think we need to have the Security folks take a look at this change?

I think matching what our most sensitive sites use in production (en.wikipedia.org, uk.wikipedia.org, zh.wikipedia.org), what experts like Scott Helme recommend, and what privacy advocates like WebKit and Mozilla use as the strict default in their browser, should be good enough for our blogs.

> I think matching what our most sensitive sites use in production (en.wikipedia.org, uk.wikipedia.org, zh.wikipedia.org), what experts like Scott Helme recommend, and what privacy advocates like WebKit and Mozilla use as the strict default in their browser, should be good enough for our blogs.

Ok, that makes sense. I'll work on getting the security plugin updated.

> I'm writing an essay on the topic,

Maybe a blog post? On Diff? :)