Disable Forward to non-WMF sites in interwiki data
Closed, Resolved, Public

Description

I propose to disable Forward to non-WMF sites in https://meta.wikimedia.org/wiki/Special:Interwiki, namely:

  1. blw - http://britainloveswikipedia.org/wiki/$1
  2. schoolswp - http://schools-wikipedia.org/wiki/$1
  3. semantic-mw - //www.semantic-mediawiki.org/wiki/$1
  4. wmar - http://www.wikimedia.org.ar/wiki/$1
  5. wmau - http://wikimedia.org.au/wiki/$1
  6. wmil - http://www.wikimedia.org.il/$1
  7. wmph - http://wikimedia.org.ph/wmph/index.php?title=$1
  8. wmuk - https://wikimedia.org.uk/wiki/$1
  9. wmve - http://wikimedia.org.ve/wiki/$1
  10. wmza - http://wikimedia.org.za/wiki/$1

May also include:

  1. sep11 - https://sep11.wikipedia.org/wiki/$1 (redirect to non-WMF archive.org)

These sites are not controlled by WMF and may become a security risk. They can be used to forward users to sites for which direct forwarding is disabled (for example, https://meta.wikimedia.org/wiki/semantic-mw:translatewiki:). It may also be possible to reach sites with improper content through a chain of forwards, though I have not found one.

Event Timeline

Restricted Application added a subscriber: Aklapper.

A bunch of the ones you listed *are* Wikimedia sites, they're websites for chapters.

Although they're websites for chapters, the wikis are not managed by WMF. They should have their interwiki tables reviewed by WMF.

Also, wmil is not a MediaWiki site, and its URL format allows forwarding users to any page on that site.

So is this a request to disable forward to non-WMF sites in interwiki data instead of "Disable Forward to non-Wikimedia sites in interwiki data"?

It would probably be useful to go through all our interwikis and make sure that the internal bit is set sanely. There are a bunch of interwikis for WMF sites that do not have the forward flag.

Bugreporter renamed this task from Disable Forward to non-Wikimedia sites in interwiki data to Disable Forward to non-WMF sites in interwiki data. Aug 4 2016, 5:41 PM
Bugreporter updated the task description.

Disable forward to non-WMF sites. This does not include any chapter sites held in WMF cluster.

This basically comes from a rather naïve regex in dumpInterwiki.php:

if ( preg_match( '/(wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|wikinews|wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata)\.org/', $url ) ) {
	$local = 1;
} else {
	$local = 0;
}

As a first step, we could put a \. at the start of the regex (which would eliminate semantic, schools, etc), and also something to anchor it at the end (to get rid of the externally-hosted chapter sites).
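To see why the unanchored pattern is too permissive, here is a minimal sketch in Python. It is a port for illustration only, not the code in dumpInterwiki.php; the helper name `naive_is_local` is hypothetical, and `re.search` stands in for PHP's `preg_match`, which also matches anywhere in the string.

```python
import re

# Python port of the original, unanchored pattern from dumpInterwiki.php.
NAIVE = re.compile(
    r"(wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|wikinews|"
    r"wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata)\.org"
)


def naive_is_local(url: str) -> int:
    """Return 1 if the pattern matches anywhere in the URL, else 0 (hypothetical helper)."""
    return 1 if NAIVE.search(url) else 0


# URLs taken from the task description, plus one unrelated entry from the interwiki map.
urls = [
    "http://britainloveswikipedia.org/wiki/$1",
    "http://schools-wikipedia.org/wiki/$1",
    "//www.semantic-mediawiki.org/wiki/$1",
    "http://www.wikimedia.org.ar/wiki/$1",
    "http://bluwiki.com/go/$1",
]

for url in urls:
    print(naive_is_local(url), url)
```

Every URL except the bluwiki one is flagged as local, because the project name only has to appear somewhere before `.org`, and nothing requires `.org` to end the host name.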

Of those chapter wikis at external domains, is it only wmve that is a WMF cluster wiki?

Wikimedia chapters are run by people we trust. I'm not super concerned that they are going to use the open redirect feature to run a phishing attack against our users. That said, it might be good to make our policy for iw_local be consistent.

A reasonable regex might be

/(?:^|\.)(wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|wikinews|wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata)\.org\//

wmve (wikimedia.org.ve) is not a WMF cluster wiki. See https://meta.wikimedia.org/wiki/Special:SiteMatrix

Looks like you're correct. I was misled by the "A Wikimedia Project" icon in the footer of their pages. Is there a policy on whether non-WMF-controlled chapter wikis can use that icon?

@Heather I guess you'd be a good person to ask that question to?

The chapters and thematic organizations have trademark agreements...

The regex also needed to accept a slash immediately before the domain name, for oldwikisource and some others...

diff --git a/dumpInterwiki.php b/dumpInterwiki.php
index d0b41a4..0c3f914 100644
--- a/dumpInterwiki.php
+++ b/dumpInterwiki.php
@@ -297,7 +300,9 @@ class DumpInterwiki extends Maintenance {
 				$prefix = str_replace( ' ', '_', $prefix );
 
 				$url = $matches[2];
-				if ( preg_match( '/(wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|wikinews|wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata)\.org/', $url ) ) {
+				if ( preg_match( '/(?:^|[\/.])(wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|' .
+					'wikinews|wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata)\.org\//', $url )
+				) {
 					$local = 1;
 				} else {
 					$local = 0;
diff --git a/iw1 b/iw2
index 972fa5b..b9dc7e6 100644
--- a/iw1
+++ b/iw2
@@ -1,3 +1,3 @@
 <?php
-// Automatically generated by dumpInterwiki.php on Fri, 05 Aug 2016 11:51:31 +0200
+// Automatically generated by dumpInterwiki.php on Fri, 05 Aug 2016 11:58:35 +0200
 return [
@@ -19,3 +19,3 @@ return [
 	'__global:bluwiki' => '0 http://bluwiki.com/go/$1',
-	'__global:blw' => '1 http://britainloveswikipedia.org/wiki/$1',
+	'__global:blw' => '0 http://britainloveswikipedia.org/wiki/$1',
 	'__global:botwiki' => '0 http://botwiki.sno.cc/wiki/$1',
@@ -253,3 +253,3 @@ return [
 	'__global:scholar' => '0 //scholar.google.com/scholar?q=$1',
-	'__global:schoolswp' => '1 http://schools-wikipedia.org/wiki/$1',
+	'__global:schoolswp' => '0 http://schools-wikipedia.org/wiki/$1',
 	'__global:scores' => '0 http://imslp.org/wiki/$1',
@@ -261,3 +261,3 @@ return [
 	'__global:slwiki' => '0 http://wiki.secondlife.com/wiki/$1',
-	'__global:semantic-mw' => '1 //www.semantic-mediawiki.org/wiki/$1',
+	'__global:semantic-mw' => '0 //www.semantic-mediawiki.org/wiki/$1',
 	'__global:senseislibrary' => '0 http://senseis.xmp.net/?$1',
@@ -357,5 +357,5 @@ return [
 	'__global:wlug' => '0 http://www.wlug.org.nz/$1',
-	'__global:wmar' => '1 http://www.wikimedia.org.ar/wiki/$1',
+	'__global:wmar' => '0 http://www.wikimedia.org.ar/wiki/$1',
 	'__global:wmat' => '0 https://mitglieder.wikimedia.at/$1',
-	'__global:wmau' => '1 http://wikimedia.org.au/wiki/$1',
+	'__global:wmau' => '0 http://wikimedia.org.au/wiki/$1',
 	'__global:wmbd' => '1 https://bd.wikimedia.org/wiki/$1',
@@ -381,3 +381,3 @@ return [
 	'__global:wmid' => '0 http://www.wikimedia.or.id/wiki/$1',
-	'__global:wmil' => '1 http://www.wikimedia.org.il/$1',
+	'__global:wmil' => '0 http://www.wikimedia.org.il/$1',
 	'__global:wmin' => '0 http://wiki.wikimedia.in/$1',
@@ -391,3 +391,3 @@ return [
 	'__global:wmpa-us' => '1 //pa-us.wikimedia.org/wiki/$1',
-	'__global:wmph' => '1 http://wikimedia.org.ph/wmph/index.php?title=$1',
+	'__global:wmph' => '0 http://wikimedia.org.ph/wmph/index.php?title=$1',
 	'__global:wmpl' => '1 https://pl.wikimedia.org/wiki/$1',
@@ -401,5 +401,5 @@ return [
 	'__global:wmua' => '1 https://ua.wikimedia.org/wiki/$1',
-	'__global:wmuk' => '1 https://wikimedia.org.uk/wiki/$1',
-	'__global:wmve' => '1 http://wikimedia.org.ve/wiki/$1',
-	'__global:wmza' => '1 http://wikimedia.org.za/wiki/$1',
+	'__global:wmuk' => '0 https://wikimedia.org.uk/wiki/$1',
+	'__global:wmve' => '0 http://wikimedia.org.ve/wiki/$1',
+	'__global:wmza' => '0 http://wikimedia.org.za/wiki/$1',
 	'__global:wm2005' => '1 https://wikimania2005.wikimedia.org/wiki/$1',

Looks good to me. Do we care enough about sep11 to special-case it?
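The effect of the two anchored variants can be checked with a small Python sketch. This is an illustrative port, not the PHP in the patch; the names `DOT_ONLY`, `FINAL` and `is_local` are hypothetical, and the bare-domain URL used for oldwikisource is an assumption about that interwiki entry.

```python
import re

PROJECTS = (
    "wikipedia|wiktionary|wikisource|wikiquote|wikibooks|wikimedia|wikinews|"
    "wikiversity|wikivoyage|wikimediafoundation|mediawiki|wikidata"
)

# First proposal: the name must follow the start of the string or a dot, and ".org" must be followed by "/".
DOT_ONLY = re.compile(r"(?:^|\.)(" + PROJECTS + r")\.org/")
# Variant from the diff: a slash is also accepted before the name, so bare domains match.
FINAL = re.compile(r"(?:^|[/.])(" + PROJECTS + r")\.org/")


def is_local(pattern: re.Pattern, url: str) -> int:
    """Return 1 if the pattern matches anywhere in the URL, else 0 (hypothetical helper)."""
    return 1 if pattern.search(url) else 0


# Entries flipped to 0 in the diff above.
for url in [
    "http://britainloveswikipedia.org/wiki/$1",
    "//www.semantic-mediawiki.org/wiki/$1",
    "https://wikimedia.org.uk/wiki/$1",
    "http://wikimedia.org.ph/wmph/index.php?title=$1",
]:
    print(is_local(FINAL, url), url)

# Entries that stay local, including sep11.
for url in [
    "https://bd.wikimedia.org/wiki/$1",
    "//pa-us.wikimedia.org/wiki/$1",
    "https://sep11.wikipedia.org/wiki/$1",
]:
    print(is_local(FINAL, url), url)

# Assumed oldwikisource URL: a bare domain, which only the slash-aware variant accepts.
bare = "https://wikisource.org/wiki/$1"
print(is_local(DOT_ONLY, bare), is_local(FINAL, bare), bare)
```

Under these assumptions, sep11 still matches because its URL sits on a wikipedia.org subdomain, so excluding it would need a special case rather than a change to the pattern.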

Even doing this doesn't really fix the underlying issue, does it? We'll still have some domains redirecting off-site:

km@km-tp ~> curl -I "https://uk.wikimedia.org/" | grep location
location:https://wikimedia.org.uk/

Plus there's also blog.wm.o, status.wm.o, etc. which all load external resources. So I'm not sure what is being gained here.
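The redirect point can be made concrete with a short Python sketch that compares the requested host with the host named in a `location` header. The function name `redirect_leaves_host` is hypothetical, and the check only compares host names; it does not decide whether the target is a WMF-controlled site.

```python
from urllib.parse import urljoin, urlsplit


def redirect_leaves_host(request_url: str, location: str) -> bool:
    """Return True if following `location` from `request_url` lands on a different host."""
    # Resolve relative Location values against the requested URL first.
    target = urljoin(request_url, location)
    return urlsplit(request_url).hostname != urlsplit(target).hostname


# The redirect observed with curl above: uk.wikimedia.org hands off to wikimedia.org.uk.
print(redirect_leaves_host("https://uk.wikimedia.org/", "https://wikimedia.org.uk/"))
# A relative redirect stays on the same host.
print(redirect_leaves_host("https://uk.wikimedia.org/", "/wiki/Main_Page"))
```

A host that passes the interwiki regex can therefore still send users off-site, which is the gap this comment is describing.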

I'm not too concerned about chapters, but sites run by people not associated with Wikimedia should definitely be removed (e.g. Semantic MediaWiki).

Given this is a low severity issue, we're going to make this discussion public [per discussion at security-team meeting].

I have somewhat mixed feelings about whether chapters should have the forward bit or not, but semantic-mediawiki et al. definitely should not, if for no other reason than consistency. I'm going to upload a patch for that shortly.

[As an aside, how do you test scripts in WikimediaMaintenance without replicating the WMF setup in its entirety?]

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)". Sep 13 2016, 4:59 AM
Bawolff changed Security from Software security bug to None.

Change 310207 had a related patch set uploaded (by Brian Wolff):
Make sure only WMF/Chapter interwikis are internal

https://gerrit.wikimedia.org/r/310207

As an aside, how do you test scripts in WikimediaMaintenance without replicating the WMF setup in its entirety?

If you don't care about your script or its output potentially being leaked, you can use the existing replica at beta.wmflabs.org/deployment-prep.

Change 310207 merged by jenkins-bot:
Make sure only WMF/Chapter interwikis are internal

https://gerrit.wikimedia.org/r/310207

Liuxinyu970226 raised the priority of this task from Low to Medium. Feb 24 2018, 12:51 PM
Liuxinyu970226 lowered the priority of this task from Medium to Low.
sbassett closed this task as Resolved. Dec 9 2019, 5:08 PM
sbassett assigned this task to Bawolff.
sbassett subscribed.

Given that https://gerrit.wikimedia.org/r/310207 was merged several years ago, I'm going to resolve this task for now. If there's any renewed interest in discussing which entities should be allowed interwiki links on the projects, someone can feel free to re-open this task and begin that discussion.

I was misled by the "A Wikimedia Project" icon in the footer of their pages.

It's understandable. But Wikimedia projects are the wikis which perform the functions of the Wikimedia movement. They just happen to be hosted by the Wikimedia Foundation in the interim. The logo there doesn't link to wikimediafoundation.org (as it does on WMF-hosted wikis), so it's not incorrect.