Commons and to a lesser extent other projects used as video hoster / file sharing site by Wikipedia Zero
Open, Normal, Public

Description

Wikipedia Zero users seem to think that http://test.wikipedia.org/ and Commons are some sort of YouTube. They have uploaded thousands of out-of-scope or copyright-violating files to these wikis.
On testwiki this is now blocked by an abuse filter: https://test.wikipedia.org/wiki/Special:AbuseFilter/160
An example on Commons can be found here

I also talked with Alex Z on IRC; part of the conversation is quoted here:

14:39:32 <AlexZ> Steinsplitter: I've been getting individual emails from the hundreds of users I've blocked there. They all seem to be under the impression that that's what test wiki is for... to be a "free YouTube" for Wikipedia Zero.
(...)
14:47:33 <AlexZ> I even saw someone try to upload a how-to video on uploading movies to test...
14:50:38 <AlexZ> https://test.wikipedia.org/wiki/Special:AbuseLog/32063

For Commons see also:
https://commons.wikimedia.org/wiki/User:Teles/Angola_Facebook_Case
https://commons.wikimedia.org/wiki/User:NahidSultan/Bangladesh_Facebook_Case

See this as well:
Mar-2016: https://pt.wikipedia.org/wiki/Wikipédia:Esplanada/geral/Operadoras_angolanas_disponibilizam_acesso_gratuito_à_Wikipedia_(16mar2016)
Jul-2015: https://meta.wikimedia.org/wiki/Wikimedia_Forum/Archives/2015-07#Wikipedia_Zero_being_used_to_violate_copyright

Also see Commons AN discussions regarding embedded data (concatenated archives):
Nov-2016: Influx of files with embedded data (CSD#F9)
Dec-2016: Influx of files with embedded data (CSD#F9) – continuation
Jan-2017: Influx of files with embedded data (CSD#F9) – continuation 2 and Telenor Wikipedia Zero issue
Mar-2017: Influx of files with embedded data (CSD#F9) – continuation 3 (was: Wikipedia Zero Abuse)
April-2017: Your Freedom - VPN exit nodes and Proposing mass block of Microsoft Azure Datacenter IP Ranges
June-2017: Influx of files with embedded data (CSD#F9) – continuation and related Proposal: Restrict Video Uploading

Commons: Wikipedia Zero recent changes (edits | uploads), Large uploads by new users; Abuse filters: JPEG, PNG, GIF,
TIFF, other formats; "WP0 Abuse", and "WP0 Abuse" 2
Testwikis: Mass upload stop filter 160 on testwiki and filter 7 on test2wiki

! In T129845#2218072, @NahidSultan wrote:

Wikimedia Bangladesh is aware of this situation and we're trying our best to prevent any further mess. We're trying to create awareness through social media by raising this issue and contacting individual group/page admins, requesting them not to promote copyrighted materials on Commons. Two of these groups have already announced (1, 2) that they will not continue uploading copyrighted material, in response to this Facebook post that was posted a few hours ago from WMBD's official Facebook page. Let's just hope that they will stick to their word.

Good news, I put your text into a translator and it seems to be properly targeted! They can actually share media via Commons; we can in a way be their YouTube, but only for PD material.

That's great news Nahid! good work.

@Yurik Assuming you're the one who created data.wmflabs.org, can you please disable uploads on that wiki before Wikipedia Zero users find out that we have a wiki that is vulnerable to copyright violations? Thanks.

Yurik added a comment. Apr 20 2016, 8:30 PM

@Pokefan95, done, but I suspect that there are tons of various wmflabs vagrant instances that allow file uploads. Also, I am a bit confused why wmflabs is in the same ip range as production - I think we should separate the two.

jayvdb added a comment. Edited Apr 20 2016, 11:42 PM

According to T131934, tool labs is a totally different ip range, and only production is zero rated.

NahidSultan added a comment. Edited Apr 24 2016, 9:32 AM

Update: It seems that Wikimedia Bangladesh's awareness is working (though it's a bit early to say). After discussing with different Facebook pages and group admins individually, most of them have agreed to stop uploading copyrighted videos. We're still receiving those uploads but mostly from individual users, not from a group. To be precise, recent uploads are coming from only one/two Facebook user(s) through various usernames (sometimes in other languages) based on the evidence from Facebook groups.

DFoy added a comment. Apr 24 2016, 9:53 AM

@NahidSultan - great to hear that your approach is having positive results! I will be in touch with the mobile operator there to verify that they are also seeing the abuse level off.

Denniss removed a subscriber: Denniss. Apr 24 2016, 10:55 AM
I will be in touch with the mobile operator there to verify that they are also seeing the abuse level off.

It would be good to know their perspective on this as well.

Yurik merged a task: Restricted Task. Apr 27 2016, 6:14 PM
Yurik added subscribers: csteipp, BBlack, akosiaris, MaxSem.
Yurik added a comment. Apr 28 2016, 2:32 PM

Social aspect: as discussed in Facebook WP weekly group, there is an article on the topic.

The article... speechless...

Hello,

I'm a radio producer with the BBC and am hoping to speak to a member of this group about the Wikipedia Zero piracy issue.

Would anyone be happy to do a pre-recorded telephone interview to be broadcast on the BBC World Service?

Many thanks,

Sam

This is probably not the right place for this request, though you can check https://wikimediafoundation.org/wiki/Press_room for press contacts.

Thanks for getting back to me Matthew - I've put a request in to the Wikimedia foundation as well.

But I'd still like to hear from someone involved in the day-to-day work of preventing piracy on the platform.

If anyone is happy to speak over the phone, I'd be very grateful.

Poyekhali moved this task from Incoming to Backlog on the Commons board. Jun 8 2016, 9:20 AM
Gunnex added a comment. Jun 8 2016, 6:35 PM

! In T129845#2207586, @Gunnex wrote on 16.04.2016:
Getting close to 300 accounts now (currently 291)...

Update: Per User:NahidSultan/Bangladesh Facebook Case/Accounts = 599 accounts...

See also GitHub: access restrictions =

The idea is to restrict access to video2commons only for a certain user group, based on status (auto-confirmed) or on (living) user edits (> 20, 50, 100, X) as all uploads are coming from fresh registered users (rarely: or from 0-edit sleepers).

and

In T129845#2365389, @Gunnex wrote on 08.06.2016:

! In T129845#2207586, @Gunnex wrote on 16.04.2016:
Getting close to 300 accounts now (currently 291)...

Update: Per User:NahidSultan/Bangladesh Facebook Case/Accounts = 599 accounts...

676 accounts...
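
The access restriction quoted above (auto-confirmed status or a minimum number of live edits) can be sketched in Python as follows. This is only an illustration: `Account`, `may_use_video_tool`, and both thresholds are hypothetical and are not taken from the video2commons code base.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds. The thread only suggests "> 20, 50, 100, X" edits
# and does not specify a minimum account age.
MIN_EDITS = 50
MIN_ACCOUNT_AGE = timedelta(days=4)


@dataclass
class Account:
    name: str
    registered: datetime  # timezone-aware registration time
    live_edits: int       # edits that have not been deleted


def may_use_video_tool(account, now=None):
    """Return True only for accounts that are old enough and have enough live edits."""
    now = now or datetime.now(timezone.utc)
    old_enough = now - account.registered >= MIN_ACCOUNT_AGE
    return old_enough and account.live_edits >= MIN_EDITS
```

Requiring both conditions covers the two patterns reported in the comment: freshly registered accounts fail the age check, and 0-edit sleeper accounts fail the edit-count check.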

MarkTraceur lowered the priority of this task from High to Normal. Dec 5 2016, 9:29 PM
MarkTraceur added a subscriber: MarkTraceur.

Lowering priority, because it seems like no progress has been made for a while, but also, I'm not sure exactly what solution is being sought here. Can someone elaborate on what technical steps should be taken to prevent this, apart from the existing AbuseFilter solution (which is already implemented)?

MarkTraceur moved this task from Untriaged to Tracking on the Multimedia board. Dec 5 2016, 9:29 PM

The AF only marked uploads for ease of review, and may have indirectly discouraged the copyvios.

Regarding the Bangladesh Facebook Case: AFAIK, after WMBD published its awareness post and reached out to several Facebook groups, some of them quit. v2c also became closed to them (an edit count and account age requirement was added, IIRC), forcing them to upload files the old way. @Gunnex and @NahidSultan should know more about this case.

Unfortunately, a similar case has just begun. Myanmar Facebook groups have been observed exploiting the gap described in T48921 (Refuse uploading JPEG files with extra junk at the end) to share files, also via zero-rating. Perhaps another task should be filed about this.
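
To make the "extra junk at the end" problem concrete, here is a minimal Python sketch that walks the JPEG marker structure and reports how many bytes follow the first end-of-image (EOI) marker. It is not the check used by MediaWiki or by the Commons abuse filters; the function name is hypothetical, and the parser is deliberately simplified (it does not validate segment contents).

```python
def jpeg_trailing_bytes(data: bytes) -> int:
    """Return the number of bytes after the first EOI marker of a JPEG stream.

    Raises ValueError if the data cannot be walked as a JPEG.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos, n = 2, len(data)
    while pos + 1 < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        pos += 2
        if marker == 0xD9:  # EOI: everything after this is trailing data
            return n - pos
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            continue  # standalone markers carry no length field
        if pos + 2 > n:
            raise ValueError("truncated segment header")
        seg_len = int.from_bytes(data[pos:pos + 2], "big")
        if seg_len < 2:
            raise ValueError("invalid segment length")
        pos += seg_len
        if marker == 0xDA:  # SOS: skip the entropy-coded scan data
            while pos + 1 < n:
                nxt = data[pos + 1]
                is_real_marker = nxt not in (0x00, 0xFF) and not 0xD0 <= nxt <= 0xD7
                if data[pos] == 0xFF and is_real_marker:
                    break
                pos += 1
    raise ValueError("no EOI marker found")
```

A non-zero result is only a hint, not proof of abuse: some legitimate files (for example multi-picture JPEGs written by certain cameras) also store data after the first EOI marker.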

This page has been updated this month: https://commons.wikimedia.org/w/index.php?title=User%3ATeles%2FAngola_Facebook_Case&type=revision&diff=236118306&oldid=194233576

It seems that when we block those uploaders on the projects they tried first, they simply move to other projects where they can publish copyrighted files. De.wiki, ca.wiki, br.wikimedia, and the wiki for Wikimania 2016 are some of the projects currently being used.

Can I recommend disallowing uploads other than images for unconfirmed users on projects other than Commons? I hope this is not stepping on their toes, since it seems to be the best solution.

IMO it would have the least impact, since we are handling the issue with more precision on Commons (i.e., new users who are not on Wikipedia Zero can still upload these media there).

Can I recommend disallowing uploads other than images for unconfirmed users on projects other than Commons?

Local upload already requires autoconfirmed to encourage uploading to Commons. We should avoid making user permissions overly complex; we can and should disable local uploads altogether on wikis where they're not monitored (hopefully de.wiki and ca.wiki are not such).

FYI we're seeing yet another influx of uploads from another Wikipedia Zero project. I now count Wikipedia Zero projects in 4 different countries as having coordinated file sharing campaigns on Commons.

The latest is using embedded data inside PNG, PDF, and OGG files.

Are there bug reports already against MediaWiki-Uploading to detect those?
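
For PNG, one simple signal for embedded data is any byte that follows the `IEND` chunk, since a well-formed PNG ends there. The sketch below is an assumption about how such a check could look, not the logic of the existing abuse filters or of Embedded Data Bot; `png_trailing_bytes` is a hypothetical name, and chunk CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_bytes(data: bytes) -> int:
    """Return the number of bytes that follow the IEND chunk of a PNG stream."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG (bad signature)")
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while pos + 12 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        pos += 8 + length + 4
        if pos > len(data):
            raise ValueError("truncated chunk")
        if chunk_type == b"IEND":
            return len(data) - pos
    raise ValueError("no IEND chunk found")
```

This does not catch data hidden inside private or ancillary chunks, so it would only be one of several checks.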

zhuyifei1999 renamed this task from Commons and testwiki used as video hoster by Wikipedia Zero to Commons and to a lesser extent other projects used as video hoster / file sharing site by Wikipedia Zero. Mar 29 2017, 6:13 AM
zhuyifei1999 updated the task description.
Revent added a subscriber: Revent. Apr 3 2017, 11:22 AM
tomasz added a subscriber: tomasz. Apr 3 2017, 1:49 PM
Revent added a comment. Apr 3 2017, 9:08 PM

https://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Blocks_and_protections#Your_Freedom_-_VPN_exit_nodes

Some WP0 pirate uploaded, today, a video detailing how to use the https://www.your-freedom.net/ Android VPN client to evade blocks on Commons. Presumably, this is a somewhat common tactic.

Investigation (by installing the client myself) showed that the exit nodes for the VPN were located on cheap cloud servers. Other than a couple of specific cases, where the ASN was either not owned by the same company, or it appears to sell connectivity to end-user ISPs, I have (after discussion with a few more admins) rangeblocked the ASNs of the cloud hosting providers affected from Commons. Hopefully this will cut down on the piracy somewhat, at least temporarily, and with luck it will discourage a few.

I'll keep an eye on the exit IPs over the next few days (presumably they jump around, since that is rather the point) and see if I can narrow them down (since the sum of the blocked ranges is quite wide). Presumably, however, nobody should be 'legitimately' editing Commons from such a service without being exempted, but I've left talk pages open so they can appeal.

Teles added a comment. Apr 4 2017, 3:12 PM

That range should probably be globally blocked, so it is not used outside of Commons. Could you please send mail to stewards@wikimedia.org with that information?

Revent added a comment. Apr 5 2017, 3:02 AM

@Teles I listed the rest of the ASNs in the AN/B thread. The specific ranges that I blocked (and there are a lot) can be seen in my (and Odder's) block logs... they are tagged as 'VPN exit'.

Revent added a comment. Apr 5 2017, 3:08 AM

By a lot, I mean several hundred ranges, on the scale of /20 and below. When a VPN exit is located, the owning ASN is shown by tools such as http://whatismyipaddress.com/ip-lookup. A number of websites list the ranges operated by a specific ASN (you can just google the ASN), and if the owner is a cloud provider it's IMO reasonable to block them all: anyone editing from such an IP is almost certainly attempting to evade a block, and any legitimate users of such VPNs (for example, to get around national firewalls that block Wikipedia) can ask for an exemption.
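
With several hundred ranges per ASN, it helps to merge adjacent ranges and to test an address against the merged list. The following sketch uses Python's standard `ipaddress` module. The ranges shown are documentation placeholders (RFC 5737), not the ranges that were actually blocked, and the helper names are hypothetical.

```python
import ipaddress

# Placeholder ranges only; the real blocked ranges are in the admins' block logs.
BLOCKED = [
    ipaddress.ip_network("198.51.100.0/25"),
    ipaddress.ip_network("198.51.100.128/25"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def merged_ranges(networks):
    """Collapse adjacent or overlapping networks into the smallest equivalent list."""
    return list(ipaddress.collapse_addresses(networks))


def is_blocked(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

Here the two /25 ranges collapse into a single /24, which is the kind of narrowing down that keeps a long block list manageable.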

As a checkuser, I can confirm that there are a LOT of ranges. From Morocco, we've seen a lot of abuse from:

two /11 ranges
a /12 range
a /14 range

As a checkuser, I don't know if I'm allowed to disclose the exact ranges, so I will refrain.
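
For a sense of scale, the number of addresses covered by the ranges listed above follows from the prefix lengths alone. The sketch assumes these are IPv4 ranges, which the comment does not state explicitly.

```python
def ipv4_range_size(prefix_len: int) -> int:
    """Number of addresses in an IPv4 network with the given prefix length."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_len)


# Two /11 ranges, one /12 range, and one /14 range, as listed in the comment.
PREFIXES = [11, 11, 12, 14]
total = sum(ipv4_range_size(p) for p in PREFIXES)
print(f"{total:,} addresses")
```

Under that assumption the four ranges together cover 5,505,024 addresses.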

Wow, those ranges are gigantic. We might block an entire ISP if we were to block them.

I am not a fan of creating yet another ABF rule, but as an ultima ratio (if needed; it is drastic) we can block uploads (by accounts with fewer than x edits) coming in over the WP zero-rating carriers.

The ABF rules for JPEG and PNG are pretty good at blocking abuse.[1] These formats account for 91% of all files on Commons and 97% of uploads by new users since 2015.[2]

Blocking other formats means it could work like autoconfirmed. You only get to upload videos after clearing a low bar. We can add new formats as either ABF improves (T154987) or Embedded Data Bot gets more sophisticated.
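
The format-based gating described here can be sketched as a small decision function. This is a hypothetical Python illustration of the policy, not actual AbuseFilter syntax: the threshold, the MIME type set, and the function name are all assumptions.

```python
# Formats that the existing abuse filter rules are said to handle well.
SCANNED_FORMATS = frozenset({"image/jpeg", "image/png"})

# Hypothetical "low bar"; the thread leaves the number of edits as "x".
MIN_EDITS = 10


def upload_decision(mime_type: str, edit_count: int) -> str:
    """Decide how to treat an upload: 'scan', 'allow', or 'deny'."""
    if mime_type in SCANNED_FORMATS:
        # Anyone may upload these; the content filters check for embedded data.
        return "scan"
    if edit_count >= MIN_EDITS:
        return "allow"
    # Other formats (video, audio, PDF, ...) are held back for new accounts.
    return "deny"
```

A carrier condition (only applying the bar to uploads that arrive over zero-rated carriers) could be added as one more argument, and formats would move into `SCANNED_FORMATS` as detection for them improves.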

[1] Excluding Animated PNGs and proprietary programs (Fireworks and PictureIt) storing data in PNG's private chunks.

/* [2] Uploads by new users (within two weeks of registration). Run time: 20 minutes */
SELECT img_media_type, img_major_mime, img_minor_mime,
       COUNT(DISTINCT user_id),  -- distinct uploaders
       COUNT(*)                  -- uploads
FROM image
JOIN user ON user_id = img_user
WHERE img_timestamp > '2015'
  AND img_timestamp BETWEEN user_registration
      AND DATE_FORMAT(user_registration + INTERVAL 2 WEEK, '%Y%m%d%H%i%s')
GROUP BY 1, 2, 3;

Format        Uploads   Percent
jpeg           871781    88.69%
png             84918     8.64%
svg+xml         11276     1.15%
gif              5808     0.59%
tiff             4267     0.43%
Ogg Vorbis       2058     0.21%
Other            2895     0.29%
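
As a quick arithmetic check, the percentages can be recomputed from the upload counts. The sketch assumes that the rows listed are the complete result set, so that their sum is the total number of uploads by new users.

```python
# Upload counts by format, copied from the table above.
UPLOADS = {
    "jpeg": 871781,
    "png": 84918,
    "svg+xml": 11276,
    "gif": 5808,
    "tiff": 4267,
    "Ogg Vorbis": 2058,
    "Other": 2895,
}

total_uploads = sum(UPLOADS.values())
percent = {fmt: 100 * count / total_uploads for fmt, count in UPLOADS.items()}
jpeg_png_share = percent["jpeg"] + percent["png"]

for fmt, value in percent.items():
    print(f"{fmt:<12}{value:6.2f}%")
print(f"JPEG + PNG: {jpeg_png_share:.1f}%")
```

The combined JPEG and PNG share comes out at roughly 97%, consistent with the figure quoted for uploads by new users.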
Dispenser added a subscriber: Zack. Apr 11 2017, 3:39 PM
Dispenser updated the task description. Apr 13 2017, 1:52 PM
Base added a subscriber: Base. Sun, May 28, 3:43 PM
alanajjar added a subscriber: alanajjar. Edited Tue, Jun 6, 1:06 AM

As I wrote here, I remembered that two days ago a new user sent me a message that I think is related to this topic. He said: "Hello, please be comfortably to understand my topic, there's some users who merge applications in pictures form to upload it in free on the wiki which they access it in free on Morocco Telecom Network, that lead to violated the policies of Wikimedia and encrypted files. This is not suitable for Wikipedia". At the time I did not really understand what he meant, but I think I now understand part of it. I hope this helps.

I am following some related Facebook groups, and I noticed the following:

A WebM file that contains an APK file has already been deleted on Commons:
https://commons.wikimedia.org/wiki/File:20170615122111!20170615014039!Younes.webm

It is still accessible on our servers:
https://upload.wikimedia.org/wikipedia/commons/archive/3/3e/20170615122111%2120170615014039%21Younes.webm

And it was promoted on Facebook just a few minutes ago:
https://www.facebook.com/alamandroidnew/posts/272371346569397

Jeff_G added a subscriber: Jeff_G. Fri, Jun 16, 1:48 PM
This comment was removed by Jeff_G.

Hi! I am following some related Facebook groups, and I noticed the following:

It is still accessible on our servers:

https://upload.wikimedia.org/wikipedia/commons/archive/3/3e/20170615122111%2120170615014039%21Younes.webm

Younes19956 added a comment. Edited Fri, Jun 16, 11:30 PM

May I ask how a dangerous file can be deleted from Wikimedia Commons?

Younes19956 added a comment. Edited Sat, Jun 17, 9:22 AM

Hello! I have a large, high-quality image (12.3 MB) that I want to upload here to show you some information about organized groups on Facebook.
Can you help me, please?
What should I do?

Dispenser updated the task description. Sat, Jun 17, 1:26 PM

Now this website, phabricator.wikimedia.org, is affected too.

Thibaut120094 added a comment. Edited Sat, Jun 17, 3:55 PM

See also https://phabricator.wikimedia.org/F8472381

zhuyifei1999 updated the task description. Mon, Jun 19, 4:41 AM