
Ensure that Terms of Use document restrictions on third-party web interactions
Open, Medium, Public

Description

My current understanding of general policy is that Cloud-hosted services should not force the end user's browser to interact with third-party services without explicit opt-in approval. However, the documentation at https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use does not clearly state that loading web content (e.g. images, JavaScript, CSS) from third-party servers (e.g. external CDNs) is an end-user privacy violation.

Event Timeline

Give examples in the Terms of Use, e.g. reCAPTCHA, CDNs, etc. Suggest that tools-static be used instead.

I'm not sure if the premise is actually true. We had a user redirecting http://tools.wmflabs.org/$tool to their own private site (which is a very, eh, complete third-party web interaction), so I asked WMF Legal about that:

From: Tim Landscheidt <tim@tim-landscheidt.de>
Subject: Silent redirects from Wikimedia Tools to third-party sites?
To: legal@wikimedia.org
Cc: […]
Date: Wed, 09 Jul 2014 11:07:20 +0000 (1 year, 35 weeks ago)
Organization: http://www.tim-landscheidt.de/

Hi,

throughout enwp, links like
http://tools.wmflabs.org/[…]/[…]
are advertised for the […] tool (there are similar
links for other tools of this user).  However, these
redirect silently to
http://[…] & Co.

My gut feeling says that when a user clicks on such a link,
he has an expectation of privacy according to the relevant
policy for Wikimedia Tools and the silent change in the URL
field does not put the onus to detect the connection to a
third-party site on him (especially since with most browsers
his data is passed onto the third-party site before he has a
chance to react), but that the tool author at Wikimedia
Tools has an obligation to make any such redirect obvious,
for example with an interstitial that requires active user
consent to proceed.

What is WMF Legal's reading on that?  Please make any answer
publishable so that tool authors (or users) can be referred
to it.

TIA,
Tim

I never received a reply, so I assume that this is not as clear-cut as it would seem.

I'm not sure why you never received a reply, but it's very clear-cut. Indeed, there was at least one user whose access to labs was suspended because of that very reason.

It's a question of notice and enforcement, not one of allowability.

I've seen some tools use Google Analytics too, and there is nothing in the Terms of Use at the moment that says not to. It would make sense for the Terms of Use to say very clearly that this is something that we do not allow.

Thanks to everyone for bringing this issue to our attention.

There have also been some discussions on labs-l about changing and clarifying the Labs Terms of Use, so this is definitely something we will be looking into changing as part of the process.

@tom29739

> I've seen some tools use Google Analytics too, and there is nothing in the Terms of Use at the moment that says not to. It would make sense for the Terms of Use to say very clearly that this is something that we do not allow.

I assume this comment is in line with all the other comments: that the issue is a lack of user consent for third-party interactions, rather than a need to disallow them wholly?

I'd be all for just disallowing, but yeah my word of mouth understanding is that 3rd party server interactions should require consent. It would be nice to get things clarified one way or the other.

Thanks Bryan.

Will do. We'll keep this task updated as we look into updating the TOU.

I've been told that 3rd party server interactions were not allowed full-stop, so it would be good to get it clarified.

This comment was removed by Dzahn.

hey @ZhouZ! Thanks for digging into this.

> I'd be all for just disallowing, but yeah my word of mouth understanding is that 3rd party server interactions should require consent. It would be nice to get things clarified one way or the other.

This has been my impression as well.

As a kind of overview to relate explanations I have seen to our current text:

https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_uses_of_Labs_do_we_not_like.3F

  1. Illegal or harmful activity: Do not break the law. This includes, but it is not limited to, accessing other systems without authorization, accessing private data without authorization, harassing or abusing others, engaging in fraud, trafficking in unlawful material, or distributing unsolicited commercial email (spam), viruses, malware, or other malicious code.

> accessing other systems without authorization

This to me has indicated that you cannot use external user tracking services or, really, any system outside of the universe of Labs and the protective TOU without authorization from the user. That authorization could be a "hey, we are doing x" acknowledgement?

  2. Misuse of Private Information: Do not collect or misuse private information of users, as defined in “Private Information”, below.

We do identify 'IP' as private information in the expanded section, and as far as I can understand it, sending requests to any system outside of Labs leaks that private information. That may be permissible with notice etc., but it seems to be a violation unless the guidance in https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_can_and_can.E2.80.99t_be_done_with_user_information.3F is followed.

  3. Proprietary software: Do not use or install any software unless the software is licensed under an Open Source license.

This becomes a shell game if we are cavalier about passing requests to systems outside of Labs, because anyone can easily put the bulk of their logic in some proprietary off-Labs system, effectively neutering this provision.

  4. Proprietary content: Do not use or create content unless it complies with the Wikimedia Licensing policy. This includes content in the public domain or freely licensed under an applicable free culture license, such as the Creative Commons Attribution ShareAlike 3.0 license.

Same issues as 3, I believe.

  5. Using Wikimedia Labs as a network proxy: Do not use Wikimedia Labs servers or projects to proxy or relay traffic for other servers. Examples of such activities include running Tor nodes, peer-to-peer network services, or VPNs to other networks. In other words, all network connections must originate from or terminate at Wikimedia Labs.

> In other words, all network connections must originate from or terminate at Wikimedia Labs.

I'm not entirely sure how expansive this statement was meant to be, but it could be relevant.

Hi @chasemp,

Thanks for pointing out these potential inconsistencies and the lack of clarity in these terms. These are all things we will be looking at as we begin the process of revising the Labs Terms of Use, probably starting off with a community mini-consultation.

> Illegal or harmful activity: Do not break the law. This includes, but it is not limited to, accessing other systems without authorization, accessing private data without authorization, harassing or abusing others, engaging in fraud, trafficking in unlawful material, or distributing unsolicited commercial email (spam), viruses, malware, or other malicious code.
> accessing other systems without authorization
> This to me has indicated that you cannot use external user tracking services or, really, any system outside of the universe of Labs and the protective TOU without authorization from the user. Which could be a "hey we are doing x" acknowledgement?

This is not very clear but I think this is actually referring to accessing another system without authorization from the system or system's administrator (e.g. do not hack into another system).

> Proprietary software: Do not use or install any software unless the software is licensed under an Open Source license.
> This becomes a shell game if we are cavalier about passing requests to systems outside of Labs because anyone can easily put the bulk of their logic in some proprietary off-Labs system, effectively neutering this provision.

Perhaps another way to approach this is to make it clear (assuming we want a strict policy) that you cannot link or use another resource that's not under an open license policy.

> Proprietary content: Do not use or create content unless it complies with the Wikimedia Licensing policy. This includes content in the public domain or freely licensed under an applicable free culture license, such as the Creative Commons Attribution ShareAlike 3.0 license.
> Same issues as 3 I believe.

We might just want to clarify that to the extent any content is actually visible to a user accessing a labs project or being created via user input on the labs project, such content should adhere to the WMF licensing policy. I don't think we can tell users to never link to a page that's not under our Wikimedia Licensing policy.

Zhou

There actually is a potential technical solution to disallowing loading javascript, images, etc from external domains. The Content-Security-Policy HTTP header provides instructions to the receiving User Agent (web browser) about what origins should be allowed for various interactions. The generic intent of this header is to limit potential abuse by malicious scripts. It allows defining a whitelist of domains that are ok to interact with for various actions.

A header like Content-Security-Policy: default-src 'self' 'unsafe-inline' data: *://*.wmflabs.org:* *://*.wikimedia.org:* *://*.mediawiki.org:* *://*.wikipedia.org:* ... could be injected into all responses at the HTTP proxy to provide browsers with a whitelist of acceptable content origins. The ... would need to list all other Wikimedia TLDs that are acceptable and would probably end up being a pretty long list. I'd be interested to hear what @csteipp thinks about the feasibility of such a solution.
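To make the idea above more concrete, here is a minimal sketch of how a proxy layer might assemble such a header value from a domain allowlist. Everything in it is an assumption for illustration: the names `ALLOWED_DOMAINS`, `build_csp` and `add_csp_header` are hypothetical, the domain list is only the four domains named in the comment (the full list is left open there), and the sketch uses the simpler schemeless `*.example.org` host form rather than the `*://*.host:*` form quoted above.

```python
from typing import Dict, Iterable

# Hypothetical allowlist: only the four domains named in the comment above.
# The real list of acceptable Wikimedia domains is not given in this thread.
ALLOWED_DOMAINS = ["wmflabs.org", "wikimedia.org", "mediawiki.org", "wikipedia.org"]


def build_csp(domains: Iterable[str]) -> str:
    """Build a default-src directive that allows same-origin content,
    inline content, data: URIs, and any subdomain of the given domains."""
    sources = ["'self'", "'unsafe-inline'", "data:"]
    # Assumption: schemeless wildcard host sources, not the *://host:* form.
    sources += [f"*.{domain}" for domain in domains]
    return "default-src " + " ".join(sources)


def add_csp_header(headers: Dict[str, str],
                   domains: Iterable[str] = ALLOWED_DOMAINS) -> Dict[str, str]:
    """Return a copy of the response headers with the CSP header set."""
    out = dict(headers)
    out["Content-Security-Policy"] = build_csp(domains)
    return out


if __name__ == "__main__":
    print(add_csp_header({"Content-Type": "text/html"}))
```

How the header would actually be injected depends on the proxy software in use, which this thread does not specify.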

As discussed on IRC a bit, this seems right to me. Thanks @bd808

@bd808 Does that stop links to external sites, or just the loading of javascript, images, etc. on the page from external sites?

> @bd808 Does that stop links to external sites, or just the loading of javascript, images, etc. on the page from external sites?

It would not stop external linking, and I do not think that external linking is actually something that we want or need to prevent. The privacy aspect is about automatically loading content from external domains in the form of images, css, javascript, iframe embeds, etc.
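The distinction drawn here, between ordinary links and automatically loaded content, can be illustrated with a small checker that a tool author might run over their own HTML. This is only a sketch under stated assumptions: the allowlist `ALLOWED_SUFFIXES`, the `AUTO_LOADED` tag/attribute set, and the function names are hypothetical, and `<link href>` is treated conservatively even though some `rel` values do not trigger a fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist of first-party domain suffixes.
ALLOWED_SUFFIXES = ("wmflabs.org", "wikimedia.org", "mediawiki.org", "wikipedia.org")

# (tag, attribute) pairs that a browser typically fetches without user action.
# <a href> is deliberately absent: following a link is the user's choice.
AUTO_LOADED = {("img", "src"), ("script", "src"), ("iframe", "src"), ("link", "href")}


def is_allowed(url: str) -> bool:
    """Relative and data: URLs have no host and count as first-party."""
    host = urlparse(url).hostname
    if host is None:
        return True
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)


class ThirdPartyLoadFinder(HTMLParser):
    """Collect (tag, url) pairs for auto-loaded resources on other hosts."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in AUTO_LOADED and value and not is_allowed(value):
                self.violations.append((tag, value))


def find_third_party_loads(html: str):
    finder = ThirdPartyLoadFinder()
    finder.feed(html)
    return finder.violations


if __name__ == "__main__":
    page = (
        '<a href="https://example.com/">external link</a>'
        '<img src="https://cdn.example.com/a.png">'
        '<script src="/local.js"></script>'
    )
    print(find_third_party_loads(page))
```

In the sample page, the external `<a>` link is ignored while the externally hosted image is reported, which mirrors the policy line described above. A static check like this cannot see requests made by scripts at run time, so it complements rather than replaces a browser-enforced header.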

I created T130748: Add Content-Security-Policy header enforcing 3rd party web interaction restrictions to proxy responses to discuss the technical details of enforcement once we have decided here what the official policy actually is.

@bd808

I will publish a new proposed draft of the labs terms of use for community consultation, incorporating the feedback from Round 1.

I have been busy with some other matters recently, so I think it will be the end of the month or early August before this is ready.

Zhou

Hi, I'm particularly interested in this task. Any news?

Zhou is not around anymore and, unfortunately, there is no news yet (if there were, it would be mentioned here). Someone with time and resources would need to pick this up again and find someone with time and resources in Legal.

(Please avoid contentless and useless "bumping" - thanks.)

There has been a small bit of work on this project in the past 6 months, but there is no resolution yet. I will provide updates if and when there is information that can be shared.

@Aklapper thank you for the update. I think that asking whether there have been any changes, after a suitable period of time has passed, is fine. In addition to prompting people to update Phabricator with relevant info that they can share, this helps to flag tasks which may have stalled.