A header like Content-Security-Policy: default-src 'self' 'unsafe-inline' data: *://*.wmflabs.org:* *://*.wikimedia.org:* *://*.mediawiki.org:* *://*.wikipedia.org:* ... could be injected into all responses at the HTTP proxy to provide browsers with a whitelist of acceptable content origins. The ... would need to list all other Wikimedia TLDs that are acceptable and would probably end up being a pretty long list. I'd be interested to hear what @csteipp thinks about the feasibility of such a solution.
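As a rough illustration of what injecting such a header at the proxy layer could look like, here is a WSGI-style middleware sketch. The class name and the (truncated) policy string are illustrative, not the actual Toolforge proxy configuration.

```python
# Illustrative policy string; the real list of Wikimedia TLDs would be longer.
CSP_POLICY = (
    "default-src 'self' 'unsafe-inline' data: "
    "*://*.wmflabs.org:* *://*.wikimedia.org:* "
    "*://*.mediawiki.org:* *://*.wikipedia.org:*"
)

class CSPMiddleware:
    """WSGI middleware that appends a Content-Security-Policy header
    to every response passing through it."""

    def __init__(self, app, policy=CSP_POLICY):
        self.app = app
        self.policy = policy

    def __call__(self, environ, start_response):
        def add_csp(status, headers, exc_info=None):
            headers = list(headers)
            headers.append(("Content-Security-Policy", self.policy))
            return start_response(status, headers, exc_info)
        return self.app(environ, add_csp)
```

In a real deployment this would more likely be a few lines of nginx/Apache configuration at the front proxy; the middleware form just makes the "injected into all responses" idea concrete.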
We can also choose to monitor instead of blocking, by using https://www.w3.org/TR/CSP/#content-security-policy-report-only-header-field . We can then allow a relatively small set of wildcard domains and filter the remaining ones server-side. It has the risk of DDoSing whichever host is receiving the reports, though, as every (?) blocked (or would-be-blocked) request triggers a POST there.
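A minimal sketch of the two deployment modes, assuming a hypothetical /csp-report collection endpoint (the helper name and path are placeholders):

```python
def csp_headers(policy, report_uri="/csp-report", enforce=False):
    """Return the (name, value) header pair for either enforcing or
    report-only mode, with violation reports POSTed to report_uri."""
    name = ("Content-Security-Policy" if enforce
            else "Content-Security-Policy-Report-Only")
    return name, f"{policy}; report-uri {report_uri}"
```

Switching `enforce` from False to True is the eventual "soft-block to hard-block" transition; everything else about the policy stays the same.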
After re-reading my own post, I realized I'm really suggesting two distinct things:
- actively using the 'report' option to
  - warn the user, and
  - tune the filter, and
- soft-blocking, at least initially.
The report option is essential to help users figure out why something isn't working (e.g. by logging to error.log) and for us to notice it if we forgot to whitelist an essential domain.
The soft-blocking is important for existing tools, to prevent them from breaking without giving the tool author a chance to fix it.
I agree that a rollout of this should have a reasonable period of using report-uri and the report-only mode before being globally enforced. After enforcement it may still be useful to have the report-uri endpoint enabled along with some reporting system that developers can check to find problems.
Using a backend like Sentry for collecting the violations might be a useful way to de-duplicate and display the errors. @Tgr has put quite a bit of work into learning how to provision and configure Sentry, but we haven't fully deployed it in beta cluster or production to my knowledge yet. We really won't know until we turn on some sort of reporting what kind of volume the endpoint would receive. The deployment plan should take that into account and include some "dark launch" time to collect initial data and determine how to proceed with the actual data collection.
Sentry is deployed in the beta cluster (deployment-sentry2) but does not have sane user management yet. I don't think there is anything blocking its use as an endpoint if you just want to see the volume, although a simple web server with hit counting could serve that purpose just as well.
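A hit-counting collector along those lines could be as small as a function that tallies reports by (document-uri, blocked-uri) pairs; the report shape below follows the "csp-report" JSON wrapper that browsers POST, and everything else is a sketch:

```python
import json
from collections import Counter

# Running tally of violations, de-duplicated by page + blocked resource.
counts = Counter()

def handle_report(body: bytes) -> None:
    """Tally one CSP violation report POST body."""
    report = json.loads(body).get("csp-report", {})
    key = (report.get("document-uri"), report.get("blocked-uri"))
    counts[key] += 1
```

Wiring this into any small web server would be enough to gauge report volume before committing to Sentry or a MediaWiki-based collector.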
For collecting reports, there's also https://gerrit.wikimedia.org/r/253969 (core patch to both set headers, and collect reports). Whenever I get some free time, I'm hoping to get that merged, and use mediawiki to collect csp reports from a couple domains.
Comments on https://www.mediawiki.org/wiki/Requests_for_comment/Content-Security-Policy are welcome too.
However, if Sentry is able to collect and present those, and can scale to production levels, I'd be happy to run Sentry instead of supporting our own code. I didn't even think of Sentry when I was looking at options.
Sentry has some custom code for interpreting CSP reports (ticket, code); no idea what that means in practice (at a glance mostly seems to deal with translating the report JSON into human-readable text).
While chatting with @ZhouZ I had an idea for an opt-in system:
- Tool X needs the user to interact with a 3rd-party service directly
- Tool X directs the user to some trusted central application for confirmation
- The central app presents the user with the actions that Tool X wants allowed
- If the user consents, the central app gives the user a cookie that will instruct the proxy to augment/exclude the CSP header from future pages served by Tool X
- The central app redirects the user back to Tool X for further interactions
There are some details that would need to be worked out for this. The cookie would need some sort of cryptographic security so that it could not be spoofed by Tool X itself. Whether the CSP header should be dropped entirely or augmented would also need to be worked out. Dropping would be easier, but may not offer as much end-user protection as we would like. A full implementation might work something like the OAuth system where the opt-in is only allowed for tools that have previously requested a certain set of exclusions and the need for that has been approved by some group of arbiters.
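A sketch of how such a signed opt-in cookie could work, assuming a hypothetical shared secret known only to the central app and the proxy (the secret value and token format here are made up for illustration):

```python
import base64
import hashlib
import hmac

# Placeholder: in reality this would live only in the central app
# and proxy configuration, never in a tool's environment.
SECRET = b"shared-between-central-app-and-proxy"

def mint_grant(tool):
    """Central app: produce a cookie value granting relaxed CSP for a tool."""
    mac = hmac.new(SECRET, tool.encode(), hashlib.sha256).digest()
    return tool + "." + base64.urlsafe_b64encode(mac).decode()

def verify_grant(cookie):
    """Proxy: return the tool name if the signature is valid, else None."""
    tool, _, sig = cookie.rpartition(".")
    expected = hmac.new(SECRET, tool.encode(), hashlib.sha256).digest()
    try:
        given = base64.urlsafe_b64decode(sig)
    except Exception:
        return None
    return tool if hmac.compare_digest(expected, given) else None
```

Because a tool never sees SECRET, it can present the cookie but cannot forge or alter one, which is the property the paragraph above asks for.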
Such a system would probably reduce usability of the tool for users who use incognito mode and break it for those who do not accept cookies at all. Not sure if that's a net privacy win.
(You don't have that problem with OAuth because there the untrusted server is requesting access to a specific user account so it cannot lie about the identity of the user. But in this case you need the user to prove their identity, at least to the extent of grant group membership.)
I do not immediately see how OAuth/OAuth2 would allow the proxy server to determine if a user had opted into third-party browser interactions for a given tool. Can you explain how you imagine this might work?
I may have made things less clear rather than more clear when I mentioned OAuth. I was merely imagining a workflow where a tool maintainer would have to create a request for CSP exemption that described the external resources they want the user to interact directly with and why. This request for exemption would then need to be reviewed and approved by some as yet unknown group of arbiters. When requesting that the CSP opt-out cookie be set the tool would need to reference the pre-approved request.
This is why the cookie would need to include some sort of cryptographic security (e.g. an HMAC signature that the proxy could verify). The secrets for securing the cookie payload would only be known to the management app and the proxy. Tools could try to mess with the cookie but they would not be able to create a valid payload that would change the CSP policy at the proxy.
Yes, if the opt-out was granted at a per-visitor level and tracked with a cookie then user-agents which routinely discard or never accept cookies would be at a disadvantage.
On the other hand if the policy exemptions are granted globally at the tool/uri level there is no way to ensure that proper consent for the removal of CSP protection has been given by the user.
I don't think that's necessary or helpful. If you can guarantee the integrity of the cookie (either by some sort of third-party cookie scheme, or by making the cookie HTTP-only and nuking cookie headers that would change it), there is no need for a cryptographic scheme. If you can't, it won't help against replay attacks - without securing cookies somehow, there is no shared secret between the browser and the management app (or the proxy), if it is the same user who gave the grant.
If you decide to go with the crypto cookie, I'd recommend using a JWT, with either an HS256 or ES256 signature. It's url-safe encoded so unlikely to get corrupted, and there are plenty of libraries out there so you don't have to try and get it right yourself.
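To illustrate the structure only (in practice you should use an existing JWT library, exactly as recommended above), a hand-rolled HS256 token is just url-safe base64 of header and payload plus an HMAC over both:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Url-safe base64 without padding, per the JWT encoding rules."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload, secret):
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token, secret):
    """Return the payload dict if the signature is valid, else None."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(expected, sig):
        return None
    padded = body + "=" * (-len(body) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

A library additionally handles `exp`/`nbf` claims, algorithm whitelisting, and the various encoding corner cases, which is why rolling your own is discouraged.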
The cookie method seems like the most straightforward solution to this problem, from my casual observation of this task.
Good points. If the cookie payload is only 'allow relaxed CSP for /tool-foo' then the same cookie would be usable for any client of tool-foo and trivially subject to a replay attack by a malicious tool-foo owner. If it was instead 'allow relaxed CSP for /tool-foo and client <salted-sha2-of-ip>' the replay attack surface would be reduced to a single IP + tool pair. Some embedded time limit could be added as well, at the additional cost of user interface complexity. I'm not yet convinced that time limits are a highly needed feature, but I could be wrong. Adding the requesting IP to the grant token adds a new way for the grant to disappear abruptly when the device switches to a new IP, causing possible end-user disruption.
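The narrower grant payload described above might be sketched as follows, with SALT standing in for a server-side secret and the field names being placeholders:

```python
import hashlib
import time

# Placeholder for a secret kept on the server side only.
SALT = b"server-side-salt"

def grant_payload(tool, client_ip, ttl=3600):
    """Build a grant bound to a tool, a salted hash of the client IP,
    and an expiry timestamp."""
    ip_hash = hashlib.sha256(SALT + client_ip.encode()).hexdigest()
    return {"tool": tool, "ip": ip_hash, "exp": int(time.time()) + ttl}

def grant_matches(payload, tool, client_ip, now=None):
    """Proxy-side check: tool, IP binding, and expiry must all hold."""
    now = int(time.time()) if now is None else now
    ip_hash = hashlib.sha256(SALT + client_ip.encode()).hexdigest()
    return (payload.get("tool") == tool
            and payload.get("ip") == ip_hash
            and payload.get("exp", 0) > now)
```

The payload would still need to be signed (e.g. as the JWT discussed earlier); this only shows how IP binding and expiry shrink the replay window, at the cost of the grant silently dying when the client's IP changes.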
Any scheme we come up with needs to be quick and simple to validate. The security proxy will need to check for and validate the authorization for each http request as the response is returned to the requesting user-agent. If we try to get fancy and insert a database lookup or a service call into this I'm afraid that it will become too slow/fragile to be generally useful.
Should mediawiki.org (without www) be allowed?
[Report Only] Refused to load the script 'https://mediawiki.org/w/api.php?callback=jQuery21403754722207000283_1539542064943&action=query&meta=siteinfo&siprop=general%7Cnamespaces&format=json&_=1539542064944' because it violates the following Content Security Policy directive: "default-src 'self' 'unsafe-eval' 'unsafe-inline' blob: data: filesystem: mediastream: *.wikibooks.org *.wikidata.org wikimedia.org *.wikimedia.org *.wikinews.org *.wikipedia.org *.wikiquote.org *.wikisource.org *.wikiversity.org *.wikivoyage.org *.wiktionary.org *.wmflabs.org wikimediafoundation.org *.mediawiki.org wss://tools.wmflabs.org". Note that 'script-src' was not explicitly set, so 'default-src' is used as a fallback.
We made a similar change in https://gerrit.wikimedia.org/r/422064 to allow wikimedia.org. There are two ways to look at this:
- www.mediawiki.org is the canonical URL. The example given returns a 301 Moved Permanently redirect to www.mediawiki.org. Strictly speaking it seems reasonable not to add the non-canonical URL to the allowed domain list.
- Historically, the 301 redirect has made use of the non-canonical domain transparent for both the end user and the developer. The intent of the CSP header is to protect users, not to punish them (or developers). Looking at things from this point of view supports adding the redirect domain to the allowed sources list.
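The matching rule underlying the report above is why the bare domain fails: a host matches a wildcard source like *.mediawiki.org only when it has at least one additional left-most label, so mediawiki.org itself is not covered. A simplified matcher (ignoring schemes and ports) illustrates this:

```python
def matches_source(host, source):
    """Match a hostname against one CSP host-source expression.
    Simplified: schemes, ports, and paths are ignored."""
    if source.startswith("*."):
        # Wildcard covers subdomains only, not the bare domain itself.
        return host.endswith(source[1:])
    return host == source

def allowed(host, sources):
    return any(matches_source(host, s) for s in sources)
```

Under this rule the policy in the report allows www.mediawiki.org but not mediawiki.org, while wikimedia.org is allowed only because it was added explicitly alongside *.wikimedia.org.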
I'm open to input from others on which way to lean here. Personally I am torn. I do not want to try to add every parked domain that may redirect to a valid wiki, but I also do not want to break things that are awkward but safe from a data disclosure point of view.