Page MenuHomePhabricator

Make AbuseFilter support a filter for cross-wiki uploads
Closed, ResolvedPublic

Assigned To
Authored By
matmarex
Jun 20 2016, 11:19 PM
Referenced Files
F4274972: pasted_file
Jul 14 2016, 8:09 PM
Tokens
"Dislike" token, awarded by saper."Dislike" token, awarded by eranroz."Yellow Medal" token, awarded by Nemo_bis."Like" token, awarded by Poyekhali.

Description

The objective of this task is to make an AbuseFilter like this possible:

action == "upload" &
user_age < 3600*24*180 &
"Cross-wiki upload from" in summary &
(
  !( file_mime like "image/*" ) |
  ( file_mime == "image/jpeg" & file_width*file_height < 10**6 )
)

See https://commons.wikimedia.org/wiki/Commons:Administrators'_noticeboard/Archive_58#AbuseFilter_for_cross-wiki_uploads for context. Activation of specific filters will continue being discussed on the wiki. Without a fix, cruder alternatives may enter into use, like https://commons.wikimedia.org/w/index.php?title=Commons%3AAdministrators%27_noticeboard&action=historysubmit&type=revision&diff=201136540&oldid=201134393

Related Objects

Event Timeline

matmarex added subscribers: INeverCry, Nemo_bis.

@Nemo_bis @Steinsplitter @INeverCry I'd appreciate if you could respond to my comments on the administrators' noticeboard thread. (Should it be un-archived or something?)

As you can see on the blocking tasks I'm working on the AbuseFilter prerequistites, I am hoping to finish my part some time this week and have the code reviewed and merged within two weeks (but ideally sooner).

@matmarex Thank you so much for the help! I highly appreciate it.

I created the filter at https://commons.wikimedia.org/wiki/Special:AbuseFilter/153 with "Prevent the user from performing the action in question". Hope it is fine.

@Steinsplitter I think creating an abuse filter message for that would be useful, to educate users what images are accepted on Commons and what is not. (But don't enable it yet, cross-wiki upload doesn't support abuse filter messages I think)

Since I was already thinking how to implement this in a way that would also make sense for TitleBlacklist, and that could also be consumed by UploadWizard, I wrote patches for them too. That made the work grow considerably, but I think the result is pretty good.

Because I was starting to lose track mysef ;), here's the list of all relevant patches (grouped by repository/area) and their dependencies (patches higher in the table should be merged before the patches lower in the table).

@Steinsplitter The new functionality is not deployed yet, I've just finished implementing it yesterday before I created that table, and now it's going through code review. As I mentioned earlier, I'm hoping that will be finished in a week or two, at which point the filter should magically start working.

I've reviewed mine - the last one is dependent on unmerged things in other projects, so I'm waiting on those. I will also need to run a quick test to make sure it all comes together, once those patches are finally merged.

matmarex triaged this task as High priority.Jul 4 2016, 6:54 PM

+1'd the one I needed to review. Feel free to merge, based on dependencies.

@Nemo_bis @Steinsplitter @INeverCry I'd appreciate if you could respond to my comments on the administrators' noticeboard thread. (Should it be un-archived or something?)

Thanks for your statistics and thoughts, I had missed them. There isn't any question for me to answer, right?

Based on clarifications by MatmaRex, AbuseFilter is actually very close to allowing the desired filter (although its handling may not be the prettiest yet); let's focus on the really required variables.

Nemo_bis renamed this task from AbuseFilter for cross-wiki uploads to Make AbuseFilter support a filter for cross-wiki uploads.Jul 9 2016, 1:02 PM

Yeah, that's true. A solution for T89302: Create check hook before really uploading file with infos about description page and file made both T89252 and T139848 very easy to implement, so I couldn't resist doing the more complete fix ;)

Yeah, that's true. A solution for T89302: Create check hook before really uploading file with infos about description page and file made both T89252 and T139848 very easy to implement, so I couldn't resist doing the more complete fix ;)

That's good, I was just confused by the dependency relationship. :)

T27086: Warning alternates between default and specified warning is not really a blocker for this, either, it's just an improvement in the interface for users who would hit the filter, and then try again.

So, everything should be done! Mostly in time for the deadline I set for ourselves in T138273#2395445 ;)

Now we wait for all the fixes are deployed to Commons with MediaWiki 1.28.0-wmf.10 (per the roadmap, this Wednesday, 13 July 2016; although there's currently a problem with logins and all wikis were rolled back to wmf.8, so this might be delayed). I'll leave this task open until that happenes.

In the meantime, you should be able to test everything on http://commons.wikimedia.beta.wmflabs.org/ and http://en.wikipedia.beta.wmflabs.org/. I documented the new stuff at https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format#Notes for documentation. If it's unclear, don't hesitate to ask on the talk page there (I'm watching it), or here on the task.

(As a consequence of these changes, most existing Commons filters that currently check for action='upload' should be changed to use action='upload' | action='stashupload', to play better with UploadWizard – action='upload' is now only used when publishing a file, and not for the initial upload to stash. I'll do this later today.)

A query to get a list of potentially affected filters: select af_id, af_public_comments, af_deleted, af_enabled from abuse_filter where af_pattern like '%upload%' order by af_enabled desc, af_deleted asc;

I reviewed the non-deleted ones: filters 29, 31, 69, 108, 110, 146, 149, 152, 153. I did not review the deleted ones, assuming they're probably not getting undeleted: filters 81, 83, 114, 121, 137, 140. (Some of the filters are private, so I'm just posting their numbers.)

You can see my changes on https://commons.wikimedia.org/wiki/Special:AbuseFilter/history (if you have access). @Steinsplitter I would appreciate if you could review them.

  • ‌ ‌ 29: updated
  • ‌ ‌ 31: updated
  • ‌ ‌ 69: unchanged – this filter will start working after the patches are deployed, previously it didn't
  • 108: unchanged – this filter will start working after the patches are deployed, previously it didn't (it's not enabled though)
  • 110: unchanged – the throttle will now correctly only count actual uploads, not uploads to stash
  • 146: updated
  • 149: OK, unchanged
  • 152: unchanged – this filter will start working after the patches are deployed, previously it didn't
  • 153: unchanged – this filter will start working after the patches are deployed, previously it didn't

Thanks for looking into the other existing filters.

@matmarex Excellent, Thank You!

I am also very very happy that T89252 is fixed. Thanks again!

Everything seems to be working fine, filter 153 has rejected 1047 actions so far (https://commons.wikimedia.org/wiki/Special:AbuseLog?wpSearchFilter=153).

Filter 152 is unused and should probably be marked as deleted.

@Steinsplitter, I have a request for you: can you add a custom warning message to filter 153 with a more specific text? Right now the user gets the generic "This action has been automatically identified as harmful", which is pretty bad. It's also apparently causing the users to retry the upload multiple times after it fails, thinking it must be a fluke (look at the abuse log). And I put quite a bit of effort to make the custom messages appear in the upload dialog, see T137841 ;)

Yes, a warning is definitely needed.

Added a warning and marked the other filter as deleted. :-)

Thank you! For anyone else curious, here's what it looks like for the user hitting the filter:

pasted_file (1×1 px, 400 KB)

This effectively de feature upload from VisualEditor which is intended mainly for new users. If the upload dialog interface isn't good enough / doesn't instruct user properly this should be fixed in the upload dialog and not by abusing the abuse filter

This effectively de feature upload from VisualEditor which is intended mainly for new users. If the upload dialog interface isn't good enough / doesn't instruct user properly this should be fixed in the upload dialog and not by abusing the abuse filter

(This has also been filed as T166540: VisualEditor file upload is broken for new user causing loose of >0.5 million files, let's continue there.)