Define policy aspects of CSP on wiki
Open, Medium, Public

Description

CSP is a technical solution to enforce privacy (and, to a lesser extent, security) standards. Technical measures should always be informed by policy decisions, and not the other way around.

There are various de facto standards related to user scripts, and what is and is not acceptable to do in them. We should make these standards explicit.

Issues at hand:

  • When is it acceptable to load external non-script data? (Current de facto: with user consent, for non-default scripts only)
  • When is it acceptable to load external scripts? (Current de facto: same as above, although I'd like to change it)
  • Is it acceptable to load data from Toolforge without user consent, or in default gadgets? (Current de facto: mostly no, but this is often not enforced)
    • A major sub-part of this is fonts from the Toolforge CDN. See also the arguments at T209998

A secondary issue might be to what extent scripts are allowed to store user data or track users in cookies and the like (current de facto: no rules, as long as it's not sent to an external party).

This needs to be discussed with various stakeholders.

Event Timeline

When is it acceptable to load external scripts (Current de-facto: same as above, although I'd like to change it)

The rationale for my desire to make loading external scripts unacceptable is that, ideally, I would like to ensure that there is an audit trail of all user JS.

It seems like one of the common reasons (esp. for pathoschild) to load external JS is so that it can be developed in git instead of on wiki pages (often the GitHub repo is then served from Toolforge). Perhaps for this use case we could make a Gerrit repo for such things, and have an extension that solely registers JS modules (which are not loaded anywhere unless a user specifically loads them in a gadget), and have on-wiki interface admins have +2 on that repo.

Most common external script loads (ordered by number of CSP reports, excluding things that appear to be spam, e.g. adsbygoogle.js):

(It can be difficult to separate what is legitimate from random spam. Full list at https://logstash.wikimedia.org/goto/2e4a66b2e1933aeb9291e3d824fde453 ).
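
As a rough illustration of the kind of tally described above (not the actual Logstash query), the sketch below counts blocked script hosts from CSP violation reports in the standard `csp-report` JSON shape. The function name, the spam marker list, and the sample reports are all assumptions made up for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumed spam markers; a real list would have to be curated by hand.
SPAM_MARKERS = ("adsbygoogle.js",)


def top_blocked_scripts(reports):
    """Count blocked script hosts across CSP reports, most frequent first."""
    counts = Counter()
    for report in reports:
        body = report.get("csp-report", {})
        # Only look at script loads, not images, fonts, etc.
        if not body.get("violated-directive", "").startswith("script-src"):
            continue
        uri = body.get("blocked-uri", "")
        if any(marker in uri for marker in SPAM_MARKERS):
            continue
        # Values such as "inline" or "eval" have no hostname and are skipped.
        host = urlsplit(uri).hostname
        if host:
            counts[host] += 1
    return counts.most_common()


# Hypothetical sample reports, for illustration only.
sample_reports = [
    {"csp-report": {"violated-directive": "script-src", "blocked-uri": "https://tools.wmflabs.org/a.js"}},
    {"csp-report": {"violated-directive": "script-src", "blocked-uri": "https://tools.wmflabs.org/b.js"}},
    {"csp-report": {"violated-directive": "script-src", "blocked-uri": "https://example.org/adsbygoogle.js"}},
    {"csp-report": {"violated-directive": "img-src", "blocked-uri": "https://example.org/x.png"}},
    {"csp-report": {"violated-directive": "script-src", "blocked-uri": "inline"}},
]
ranking = top_blocked_scripts(sample_reports)
```

Substring matching on spam markers is deliberately crude; as noted above, telling legitimate loads from spam still needs human judgement.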

Also on wikivoyage, see the semi-official https://tools.wmflabs.org/wikivoyage/w/data/en-articles.js used at MediaWiki:Kartographer.js (T244691)

sbassett triaged this task as Medium priority. Nov 25 2019, 3:54 PM
sbassett moved this task from Incoming to Watching on the Security-Team board.

Removing Application Security Reviews since there's no actual review request within this task.

When is it acceptable to load external non-script data (Current de-facto: With user consent, for non-default scripts only)

The "non-default scripts only" part isn't accurate, and even the "with user consent" part is iffy. Almost every wiki has default (or WMF-installed) scripts that load external data:

  • The UploadWizard on Commons supports importing images from Flickr via the Flickr API (with a privacy warning to the user).
  • Wikisources use the Google Vision API (via a Cloud Services proxy) to do OCR. This is a site script (Common.js) on many of them and a gadget on others.
  • Many Wikipedias (including English Wikipedia) still use WikiMiniAtlas (aka GeoHack), which is a site script with no consent/privacy warnings.
  • Wikivoyage maps have several features that load content or scripts from external sites (with a privacy warning).

Maps are indeed a notable ongoing issue (WikiMiniAtlas etc.). Both staff and community have made a lot of progress toward defaulting those to use maps.wikimedia.org, which was in part put into production for exactly that reason. It's possible some wikis still have an older version of the script that uses the Cloud instance instead and/or still have Toolforge-hosted overlays enabled by default.

Aside from Maps though, all gadgets that involve third parties are opt-in (afaik). Any attempts to enable gadgets or deploy common.js code that communicates with Cloud VPS or other non-prod infrastructure have been, and will continue to be, consistently shot down on sight, no questions asked, citing the WMF Privacy Policy. This typically happens a few times a year.

Access logs in Cloud VPS are accessible to users who are not under NDA, and can also be compromised in ways we would not necessarily know about. That is by design, and is why it is unlikely to become acceptable to send PII there without the user opting in.

The reason CSP hasn't been enforced yet is that these opt-ins are currently not machine-readable. Part of T208188 is to make it possible for gadgets and/or individual users to store their opt-in flag in association with a particular third-party hostname or tool, so that the CSP rule can whitelist it. Until then, enforcement is done manually by stewards, interface admins, and security/privacy staff. At times it is escalated to Legal, who can then make the edit via an office action if for some reason it is not advisable to perform the change as a volunteer or individual staff member (this hasn't happened in years, but I'm mentioning it for completeness).
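
To make the machine-readable opt-in idea concrete, here is a minimal sketch of how stored opt-in hostnames could be folded into a CSP header value. It is not the T208188 design: the function name, the baseline source list, and the choice of `default-src` are assumptions for illustration only.

```python
import re

# Assumed baseline sources that are always allowed (illustrative, not the real policy).
DEFAULT_SOURCES = ("'self'", "*.wikimedia.org")

# Conservative hostname check: two or more dot-separated labels made of
# letters, digits and hyphens. No wildcards, schemes, ports or paths.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
_HOSTNAME_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})+")


def build_csp(opted_in_hosts):
    """Return a CSP header value allowing the defaults plus the opted-in hosts."""
    extra = []
    for host in opted_in_hosts:
        host = host.strip().lower()
        if len(host) > 253 or not _HOSTNAME_RE.fullmatch(host):
            # Drop anything that is not a plain hostname, so a stored opt-in
            # cannot smuggle in extra directives or wildcards.
            continue
        source = "https://" + host
        if source not in extra:
            extra.append(source)
    return "default-src " + " ".join(list(DEFAULT_SOURCES) + extra)


# Hypothetical opt-in list for one user: one valid host, one injection attempt.
header = build_csp(["tools.wmflabs.org", "evil.example; script-src *"])
```

A real implementation would likely keep separate directives for scripts and for non-script data, since the task treats those two cases differently.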