When the target wiki has the abuse filter enabled, use it
Closed, ResolvedPublic2 Story Points

Description

Motivation
Files uploaded through the file importer should adhere to current standards. Therefore they should be able to pass the abuse filter, if the target wiki (such as Commons) has one

Task
If appropriate, we should:
When saving the fileImporter specialPage, the imported file should pass the abuse filter.

What makes it difficult is that we need to enter the old revisions in order, i.e. the current file revision needs to be the last one.Old revisions should also trigger the abuse filter

Related Objects

Lea_WMDE created this task.Mar 15 2017, 4:21 PM
Andrew-WMDE removed Andrew-WMDE as the assignee of this task.Oct 30 2017, 4:12 PM
Andrew-WMDE added a subscriber: Andrew-WMDE.
Lea_WMDE renamed this task from Investigate and use the abuse filter for the FileImporter if appropriate to When the target wiki has the abuse filter enabled, use it.Nov 2 2017, 9:03 AM
Lea_WMDE updated the task description. (Show Details)
Tobi_WMDE_SW set the point value for this task to 21.Feb 13 2018, 4:02 PM
Tobi_WMDE_SW added a subscriber: Tobi_WMDE_SW.

We probably need to distinguish between images and text.

Lea_WMDE moved this task from Todo to Sprint ready on the WMDE-QWERTY-Team board.Mar 6 2018, 11:36 AM
WMDE-Fisch claimed this task.

So when it comes to uploads, the AbuseFilter extension hooks itself into the UploadVerifyUpload hook, that allows stuff do be checkd on the file, text, edit and edit summary of an upload in the UploadBase class. These are the checks that IMO should be also performed for the imported file and text revisions. And as I see it they are not at the moment because the only place this hook is called is in UploadBase::performUpload. I will try to call that hook at corresponding places in our code.

The AbuseFilter has certain text it provides when it took action. - E.g. When I run the hook on our file uploads, and the filter triggers that blocks the upload, I get an error text from it that looks like that:

This action has been automatically identified as harmful, and therefore disallowed.
If you believe your action was constructive, please inform an administrator of what you were trying to do.
A brief description of the abuse rule which your action matched is: Upload Large Test

Upload Large Test is the name of my filter here ( set via the SpecialPage from AbuseFilter to configure filters on wiki. e.g. see https://commons.wikimedia.org/w/index.php?title=Special:AbuseFilter )

Questions:

  • Do we want to expose that message to the user as it is?
  • Do we want to replace it with something genereic? ( e.g. some filter rules are in place that prevent your upload. )
  • Do we want to expose it and add some additional message?

@Lea_WMDE @Charlie_WMDE

Thanks @WMDE-Fisch I'm adding @Hanna_Petruschat_WMDE since she is the new UX lead for the FileImporter.
One question: When will this be triggered? The moment that the user hits the "import" button, or when they click the link?
And one reason why we might want to change the message is that we have multiple revisions that can be affected, whereas usually there is only one. Fisch, do we know which revision is affected specifically?

Sooo for the time being, I would trigger this the moment the user hits "Import". - At least for now and for the FileRevisions, because at that moment we have a similar state of the data like when the UploadVerifyUpload Hook is called in UploadBase and we also do other final checks there. - But I assume that could be changed. - We could also have data about the revsion affected, since we loop through them checking each separately.

Even if we might want to move the checks to the start we still would need at least one final check for the text in cases it got changed.

Change 417307 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] [WIP] Include Hooks relevant for AbuseFilter

https://gerrit.wikimedia.org/r/417307

... and thinking about it, if you just display the error messages somewhere for now, I guess it would be best. We'll have another ticket about improving error messages anyways, and then this ticket won't be blocked.

Tobi_WMDE_SW changed the point value for this task from 21 to 2.Mar 20 2018, 4:06 PM

Change 417307 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Include validation steps relevant for AbuseFilter

https://gerrit.wikimedia.org/r/417307

So I managed to get an error: Using https://commons.wikimedia.org/wiki/File:ADAC-Zentrale,_Munich,_March_2017-04.jpg I first added "kitten papaya" here:

`=={{Assessment}}==
{{picture of the day|year=2018|month=03|day=28}}
{{Assessments|featured=1|quality=1}}

kitten papaya`

and when hitting "import" I received both warnings in one box, with no separation (Is this something that we style? If yes, we should seperate them somehow to make it visible that it is two seperate issues).
Then I went back and removed "papaya". Now, when hitting "import", I received

. However, using the backbutton multiple times, submitting the kitten only again and then hitting import resulted in making this work again.

... also the error message itself looks broken, such as the kitten message:
'''Warning:''' This action has been automatically identified as harmful. Unconstructive actions will be quickly reverted, and egregious or repeated unconstructive editing will result in your account or IP address being blocked. If you believe this action to be constructive, you may submit it again to confirm it. A brief description of the abuse rule which your action matched is: <div class="mw-parser-output"><p>Mark all Kittens </p></div>

Ok after looking into it, there really is something different when two messages should appear. I will fix that. On the other hand, the last part of the text seems to be broken in any case and seems not related to anything we do in particular ... see the screenshot below:


(trying to add "papaya kitten" to a normal text on that wiki)

I will look into that and might create a ticket so it can be fixed in core/abusefilter

Change 422423 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Use parsed HTML for generated validation error texts

https://gerrit.wikimedia.org/r/422423

Change 422423 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Use parsed HTML for generated validation error texts

https://gerrit.wikimedia.org/r/422423

I created a ticket for the above issue with HTML showing: T190942

I created a ticket for the above issue with HTML showing: T190942

Got merged into T173249 patch is up at https://gerrit.wikimedia.org/r/418584 works for our case and has +1s already.

so should I test again? Then please move it to the column :)

Oh, right the other patch got merged already ... sure ... but please ignore the <html> tags ;-).