
Can/should *.wmflabs.org be added to the default-src Content Security Policy?
Closed, Declined (Public)

Description

I'm creating a gadget (T223776) that queries an endpoint at tools.wmflabs.org, but this seems to violate the current CSP:

[Report Only] Refused to connect to 'https://tools.wmflabs.org/externalitemsuggester/search?property=P214&value=test' because it violates the following Content Security Policy directive: "default-src 'self' data: blob: upload.wikimedia.org https://commons.wikimedia.org meta.wikimedia.org *.wikimedia.org *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org wikimedia.org". Note that 'connect-src' was not explicitly set, so 'default-src' is used as a fallback.

While it's report only, it's a bit worrying. Can *.wmflabs.org be added to the CSP? Looking at T130748, it seems like it was included at some point, but was then removed? I didn't manage to find a discussion about it.
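
The fallback mentioned at the end of the report can be illustrated with a small sketch. This is not how browsers or MediaWiki implement CSP; it is a simplified model that assumes host matching only (schemes, ports, paths and 'self' are ignored), and the helper names parse_csp, effective_sources, host_matches and connect_allowed are made up for illustration. The policy string is the one quoted in the report above.

```python
def parse_csp(policy: str) -> dict:
    """Split a CSP string into {directive_name: [source expressions]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # If a directive is repeated, only the first occurrence is kept.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def effective_sources(directives: dict, name: str) -> list:
    """Return the sources for `name`, falling back to default-src if unset."""
    if name in directives:
        return directives[name]
    return directives.get("default-src", [])


def host_matches(source: str, host: str) -> bool:
    """Simplified host-source match: ignores scheme, port, path and 'self'."""
    pattern = source.split("://", 1)[-1].lower()
    host = host.lower()
    if pattern.startswith("*."):
        # "*.example.org" matches subdomains, but not "example.org" itself.
        return host.endswith(pattern[1:])
    return host == pattern


def connect_allowed(policy: str, host: str) -> bool:
    """Would a fetch/XHR to `host` pass under this (simplified) policy?"""
    sources = effective_sources(parse_csp(policy), "connect-src")
    return any(host_matches(s, host) for s in sources)


# The default-src list quoted in the report above.
POLICY = (
    "default-src 'self' data: blob: upload.wikimedia.org "
    "https://commons.wikimedia.org meta.wikimedia.org *.wikimedia.org "
    "*.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org "
    "*.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org "
    "*.wikidata.org *.wikivoyage.org *.mediawiki.org wikimedia.org"
)

if __name__ == "__main__":
    for host in ("tools.wmflabs.org", "www.wikidata.org"):
        print(host, connect_allowed(POLICY, host))
```

In this model, an explicit connect-src replaces the default-src list for connections rather than extending it, so adding a host to connect-src alone would also require repeating every other host that gadgets need to reach.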

Related issue: T220475

Event Timeline

Can *.wmflabs.org be added to the CSP?

See also https://phabricator.wikimedia.org/T207900#4846582

This task should probably be closed and discussed in T28508: Content Security Policy (CSP) instead, if there are still open topics? :)

*.wmflabs.org should certainly not be added, not sure about tools.wmflabs.org.

*.wmflabs.org should certainly not be added, not sure about tools.wmflabs.org.

Why not? Can you elaborate? For scripts I agree that these shouldn't be loaded from *.wmflabs.org, but why not for data? All wmflabs.org traffic is handled by proxies, right? That should take care of the privacy concerns.

No, only some parts of wmflabs.org are hosted by trusted proxies. Even those parts I'm not sure about - they probably don't reveal IP to origin but what about other stuff that needs to be treated as private, like UAs?

No, only some parts of wmflabs.org are hosted by trusted proxies.

Do you have some examples? I was under the impression all of it was through https://wikitech.wikimedia.org/wiki/Help:Proxy . Anyway, everything running on wmflabs is covered by https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use#What_can_and_can%E2%80%99t_be_done_with_user_information? and the general https://foundation.wikimedia.org/wiki/Privacy_policy . This covers all the privacy concerns. Do you have other concerns? You didn't elaborate on why *.wmflabs.org shouldn't be added so it's unclear to me what your concerns are.

An example might be utrs.wmflabs.org, which says "By using this project, you agree that the volunteer administrators of this project will have access to any data you submit. This can include your IP address, your username/password combination for accounts created in Labs services, and any other information that you send.". The general privacy policy does not apply on wmflabs.org, see the wikitech page you linked, which says (in a warning that's mandatory to include) "these terms regarding use of your data expressly override the Wikimedia Foundation's Privacy Policy".

No, only some parts of wmflabs.org are hosted by trusted proxies.

Do you have some examples? I was under the impression all of it was through https://wikitech.wikimedia.org/wiki/Help:Proxy . Anyway, everything running on wmflabs is covered by https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use#What_can_and_can%E2%80%99t_be_done_with_user_information? and the general https://foundation.wikimedia.org/wiki/Privacy_policy . This covers all the privacy concerns. Do you have other concerns? You didn't elaborate on why *.wmflabs.org shouldn't be added so it's unclear to me what your concerns are.

Anything with a floating IP (excluding the trusted proxies themselves). The most obvious one that comes to mind is deployment-prep, but there are many others - you can run a script with novaobserver credentials across all projects to find a list of instances that are assigned floating IPs (note some projects may have floating IPs that they have not yet allocated to instances). There are some things that go through proxies, but it is a subset of *.wmflabs.org.
As far as I know, that Cloud ToU is insufficient for the handling of production users' private data. Preventing/discouraging the storage of such data is not enough - it needs to be actually unavailable, even for processing as part of a stream, to users who have not signed NDAs etc.
It should not be added because (a) of the obvious privacy concerns, and (b) to prevent production sites relying on non-production sites for functionality.

As far as I know, that Cloud ToU is insufficient for the handling of production users' private data. Preventing/discouraging the storage of such data is not enough - it needs to be actually unavailable, even for processing as part of a stream, to users who have not signed NDAs etc.

I see room for improvement here too. Improve the terms of use for cloud, and maybe require projects with a public IP to have all their members in the NDA group? These kinds of improvements should probably go into a new task.

It should not be added because (a) of the obvious privacy concerns, and (b) to prevent production sites relying on non-production sites for functionality.

For (b) that's already the case.

Fact is that our sites rely on the cloud infrastructure for all sorts of functions. What brought me here was one such tool: T227162 ("Good Pictures" button in Commons Categories blocked by content security policy). Gadgets using data from the Wikimedia cloud are not a bad thing and shouldn't be prevented by some technical measure.

[...] require projects with a public IP to have all their members in the NDA group? These kinds of improvements should probably go into a new task.

This idea is pretty much fundamentally opposed to the original idea of labs, and strikes me as a gigantic step in the opposite direction of where it should be going. Adding extra restrictions just so we can process data that we shouldn't be able to see in the first place? Note that, off the top of my head at least, I can't think of a technical reason someone couldn't point a domain under *.wmflabs.org to a completely external server anyway.
It would be safer to whitelist just specific subdomains that are trustworthy - or maybe make a new domain which such privileged services may run under. Like *.prodtrusted.wmflabs.org, with a tools.prodtrusted.wmflabs.org subdomain for tools and so on. Only projects where the admins all have NDAs and are trusted would have a subdomain there, and we already have arbitrary zone creation disabled IIRC so wouldn't have to worry about that either.
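
To make the difference in scope concrete, here is a rough sketch comparing how much each candidate allow-list entry would cover. It assumes simplified CSP host matching (no schemes, ports or paths), the host_matches and coverage helpers are made up for illustration, and tools.prodtrusted.wmflabs.org is only the hypothetical name proposed above, not an existing host.

```python
def host_matches(source: str, host: str) -> bool:
    """Simplified CSP host-source match (scheme, port and path ignored)."""
    pattern = source.lower()
    host = host.lower()
    if pattern.startswith("*."):
        # A leading wildcard covers subdomains at any depth, not the bare domain.
        return host.endswith(pattern[1:])
    return host == pattern


def coverage(sources, hosts):
    """Map each candidate source expression to the hosts it would allow."""
    return {s: [h for h in hosts if host_matches(s, h)] for s in sources}


# Candidate allow-list entries discussed in this thread.
CANDIDATE_SOURCES = [
    "*.wmflabs.org",
    "tools.wmflabs.org",
    "*.prodtrusted.wmflabs.org",  # hypothetical domain proposed above
]

# Sample hosts: two mentioned in the thread, one hypothetical.
HOSTS = [
    "tools.wmflabs.org",
    "utrs.wmflabs.org",
    "tools.prodtrusted.wmflabs.org",  # hypothetical
]

if __name__ == "__main__":
    for source, allowed in coverage(CANDIDATE_SOURCES, HOSTS).items():
        print(f"{source:28} -> {allowed}")
```

Under these assumptions the blanket *.wmflabs.org entry admits every sample host, including utrs.wmflabs.org, while the narrower entries admit only the hosts that were deliberately placed under them.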

For (b) that's already the case.

Fact is that our sites rely on the cloud infrastructure for all sorts of functions. What brought me here was one such tool: T227162 ("Good Pictures" button in Commons Categories blocked by content security policy). Gadgets using data from the Wikimedia cloud are not a bad thing and shouldn't be prevented by some technical measure.

Maybe it works right now by accident, but allowing it under CSP would be officially condoning this practice and permitting it to work in future.

chasemp triaged this task as Medium priority. Dec 9 2019, 4:08 PM
Bawolff subscribed.

The plan is to have a preference where users can adjust their CSP header (T208188). We will not be adding wmflabs.org to the allow list by default, but users will be able to opt into it.