Some abuse filters can be bypassed (never match any actions)
Closed, ResolvedPublic

Description

Upload abuse filters can be occasionally bypassed somehow.

On 26 November 2016‎, Panoramio upload bot has somehow managed to upload https://commons.wikimedia.org/wiki/File:FIRMA_COTTEN_IND_ILOL_-_panoramio.jpg (upload log). This image is a copy of the Windows 7 stock image "Chrysanthemum.jpg", and should have been prevented by filter 31 (SHA1 f5f8ad26819a471318d24631fa5055036712a87e matches: imageinfo).

Upload abuse filters still appear to work most of the time to disallow various uploads (abuse log), but that one time it clearly didn't work and I'm concerned. I could not reproduce the problem locally with an identical filter (the upload was prevented, for each of the few ways I tried).

matmarex created this task.Dec 14 2016, 5:56 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 14 2016, 5:56 PM
Restricted Application added a project: Multimedia. · View Herald TranscriptDec 14 2016, 6:02 PM

OK, I see… I can reproduce the issue locally with $wgAbuseFilterParserClass = 'AbuseFilterCachingParser';. This was enabled for Commons with https://gerrit.wikimedia.org/r/#/c/316354/ on 2016-10-17. I suppose there's something about this specific filter that makes the new parser produce wrong results.

matmarex added a subscriber: ori.Dec 14 2016, 6:24 PM

That config change has already caused different filters to break twice, and was reverted and re-enabled: T148660, T148673. I think we should revert it again ASAP, and keep it reverted this time until @ori or @vvv prove that it's not completely broken (e.g. by writing a test suite for it).

matmarex added a subscriber: vvv.Dec 14 2016, 8:14 PM
matmarex added a subscriber: thcipriani.

Patch deployed by @thcipriani and submitted to Gerrit as https://gerrit.wikimedia.org/r/327269.

matmarex closed this task as "Resolved".Dec 14 2016, 8:15 PM
matmarex triaged this task as "High" priority.
matmarex changed the visibility from "Custom Policy" to "Public (No Login Required)".

Can you file a separate task about the bug in AbuseFilterCachingParser?

matmarex changed the title from "Upload abuse filters can be occasionally bypassed somehow" to "Some abuse filters can be bypassed (never match any actions)".Dec 14 2016, 8:55 PM
matmarex added subscribers: Jdforrester-WMF, greg.

All production wikis were affected by this issue, in the following periods. But other than Commons' filter 31, we can't say which filters have been affected, until someone actually debugs the problem. For now we just reverted to a known good version of the code.

WikiAffected sinceCommit
mediawiki.org and Testwiki2016-10-13ccbd700e720f9f2df2abf2b74050f0eadee0bd56
Commons and Meta-Wiki2016-10-173d14a0d9fd08c1424717e785bdb7cc1399c8421b
All other wikis2016-11-162cb4b59548c5fa336152d7cdfc287545e18a097a

We should announce to the communities that this happened. (Perhaps after we know which filters were broken, if anyone is planning to investigate.)

Ideally, we would also re-run the filters for all the actions in this time period, but there is currently no mechanism to do this and it would be a significant effort to develop and test one (in the ballpark of a person-week, I guess), and it could take a while to run such a script. (While we can't retroactively prevent actions from being made, we could still tag the edits and add entries to abuse log, etc.) @greg @Jdforrester-WMF Could you consider this and decide whether it's worth it?

matmarex edited projects, added Commons; removed AbuseFilter, MediaWiki-Uploading.

Can you file a separate task about the bug in AbuseFilterCachingParser?

Thanks for the reminder, filed T153251 for AbuseFilter, let's keep this task about the Wikimedia perspective.

Tgr added a subscriber: Tgr.EditedDec 14 2016, 9:43 PM

Filters can be hand-tested on recent changes, maybe just ask admins to do that? It wouldn't leave a tag/log audit trail though.

I don't think it's worth trying to retrospectively re-run AbuseFilters (especially as it looks like many of the important ones deny upload/creation which would be impossible to "tag"). However, I think @Tgr is right and we should highlight this to sysops (maybe just in Tech/News, maybe in a more prominent way?).

Johan added a subscriber: Johan.EditedDec 15 2016, 1:28 PM

Something like this for next issue of Tech News?

Some abuse filters for uploaded files have not worked as they should. We don't know exactly which filters didn't work yet. This means some files that filters should have prevented from being uploaded were uploaded to the wikis. MediaWiki.org and Testwiki have been affected since October 13. Commons and Meta have been affected since October 17. Other wikis have been affected since November 17.

I think that's fine, although it's possible that other filters (not just those about action='upload') were also affected – we literally have no idea.

@vvv did the debugging and found out the reason: T153251#2884699. In short, the problem affected filters using variadic functions (functions that can take any number of arguments, like contains_any) where there is a trailing comma , before the closing parenthesis ).

The following wikis and the following filters were likely affected. I ran this SQL query on all databases to find them:

select 'foowiki' as wiki, af_id
from foowiki.abuse_filter
where af_pattern rlike ',[ \t\v\f\r\n]*(/\\*.*\\*/[ \t\v\f\r\n]*)*\\)'
and af_enabled and not af_deleted
Wiki nameFilter number
arwiki64
commonswiki31
cswiki2
dawiki11
elwiki3
enwiki755
enwiki756
eswiktionary6
fawiki3
fawiki9
fawiki19
fawiki33
fawiki120
fawikibooks10
fawikinews11
fawikinews14
fawikinews23
fawikiquote2
fawikivoyage2
fawikivoyage8
fawiktionary9
fiwiki124
gagwiki1
hiwiki43
hiwiki61
hiwiki80
jawiki18
jawikibooks2
jawikiquote1
jawikisource1
jawikisource2
kkwiki3
metawiki114
mlwiki3
mrwiki21
nnwiki5
nowiki4
ruwiki14
svwiki85
testwiki81
testwiki108
zh_yuewiki3
zh_yuewiki20
zhwiki6
zhwiki31
zhwiki37
zhwiki38
zhwiki39
zhwiki53
zhwiki55
zhwiki92
zhwiki191
zhwikinews12

I think we should send an extra MassMessage to village pumps on these 32 wikis, with a list of their affected filters. @Johan Can you help?

FYI, enwiki filters 755 and 756 are mine and they're not affected AFAICT: they don't use variadic functions and the ,) arises inside a regex. May I also suggest sending any enwiki message also to the Edit Filter Noticeboard? Thanks.

The following wikis and the following filters were likely affected. I ran this SQL query on all databases to find them:

select 'foowiki' as wiki, af_id
from foowiki.abuse_filter
where af_pattern rlike ',[ \t\v\f\r\n]*(/\\*.*\\*/[ \t\v\f\r\n]*)*\\)'
and af_enabled and not af_deleted
Wiki nameFilter number
arwiki64
commonswiki31
cswiki2
dawiki11
elwiki3
enwiki755
enwiki756
eswiktionary6
fawiki3
fawiki9
fawiki19
fawiki33
fawiki120
fawikibooks10
fawikinews11
fawikinews14
fawikinews23
fawikiquote2
fawikivoyage2
fawikivoyage8
fawiktionary9
fiwiki124
gagwiki1
hiwiki43
hiwiki61
hiwiki80
jawiki18
jawikibooks2
jawikiquote1
jawikisource1
jawikisource2
kkwiki3
metawiki114
mlwiki3
mrwiki21
nnwiki5
nowiki4
ruwiki14
svwiki85
testwiki81
testwiki108
zh_yuewiki3
zh_yuewiki20
zhwiki6
zhwiki31
zhwiki37
zhwiki38
zhwiki39
zhwiki53
zhwiki55
zhwiki92
zhwiki191
zhwikinews12

I don't think commonswiki filter 31 is not affected, it is still working and disallowing frequently uploaded non-free files.

Restricted Application removed a subscriber: Pokefan95. · View Herald TranscriptDec 20 2016, 7:46 AM
Restricted Application removed a subscriber: Pokefan95. · View Herald TranscriptDec 20 2016, 7:47 AM

@Pokefan95 It is working now, after we reverted the buggy changes, but it used to be broken, and copies of these files were uploaded (see a list in my earlier comment).

Restricted Application removed a subscriber: Pokefan95. · View Herald TranscriptDec 20 2016, 8:47 AM
Johan added a comment.Jan 3 2017, 3:00 AM

@matmarex Sorry, was on holiday break and didn't see this ping until now. Still want my help?

Eh, so was I, and I kind of forgot about this to be honest. Perhaps we don't need to do that, this task was linked from that Tech News and anyone interested would have found it.