- Look at the Grafana board for AbuseFilter errors occurring on import
- Create an overview of what these errors mean and how often they appear so we can discuss them
Overview of all errors we log (defaults to the last 30 days):
https://grafana.wikimedia.org/d/000000553/mediawiki-fileimporter?panelId=25&fullscreen&orgId=1
List of errors over the last 365 days (with >23 K successful uploads):
file_missing_required_template | 2.2 K |
userPermissionsError | 1.9 K |
duplicateFiles | 901 |
cantimportfilehidden | 356 |
file_contains_blocked_category_template | 300 |
commonshelper_missing_config | 172 |
abusefilter_warning_otrs | 109 |
operationCommit | 88 |
404 | 81 |
userBlocked | 72 |
cantimportfromsharedrepo | 60 |
filetype_mime_mismatch | 38 |
commonshelper_parsing_failed | 37 |
abusefilter_warning_blanking2 | 34 |
api_badinfo | 33 |
abusefilter_warning_copyv2 | 30 |
cantimporturl | 23 |
spam_blacklisted_link | 22 |
userGloballyBlocked | 21 |
abusefilter_disallowed | 21 |
uploaded_href_unsafe_target_svg | 18 |
1_2 | 18 |
filemissinginrevision | 12 |
0 | 12 |
uploaded_href_attribute_svg | 11 |
abusefilter_warning_review | 11 |
upload_scripted_dtd | 7 |
filenameerror_notallowed | 7 |
400 | 6 |
revisionMissingField | 5 |
filetoolarge | 5 |
api_toomanyrevisions | 5 |
abusefilter_warning_use_delete_gadget | 5 |
tiff_bad_file | 3 |
cantparseurl | 3 |
cantimportmissingfile | 3 |
api_failedtogetinfo | 3 |
abusefilter_warning_mp3 | 3 |
uploadinvalidxml | 2 |
api_nopagesreturned | 2 |
noSourceApiFound | 1 |
noNullRevisionCreated | 1 |
filenameerror_noplannedextension | 1 |
Detailed look into the AbuseFilter rules.
abusefilter_warning_otrs | 109 | Filter 69 - Adding OTRS permission by non-OTRS member - This rule checks that OTRS templates are added only by users allowed to do so. | Should just give a warning and add a tag. | Makes no sense on older revisions, only on the current one. See also T213409 |
abusefilter_warning_blanking2 | 34 | Filter 4 - Page blanking - If a page has been emptied for some reason, the import is completely blocked. | Blocks the import completely. | Makes no sense on older revisions. |
abusefilter_warning_copyv2 | 30 | Filter 154 - Possible copyvio - Possible copyright violations. Checks for specific trigger words in the wikitext (e.g. getty, shutterstock). | Should just give a warning. | Makes no sense on older revisions. |
abusefilter_disallowed | 21 | Unspecified rule. - One or more rules that we currently cannot or do not distinguish in the logging. | ||
abusefilter_warning_review | 11 | Filter 70 - License review by non-Image-reviewers - I don't have permission to see that filter in detail, but I'm quite confident that it is the filter behind this error. I guess this is about the "usage" of a template that not all users are allowed to use, so importing a file with that template triggers the rule. - I guess it would be safe to have this disabled for old revisions. | ||
abusefilter_warning_use_delete_gadget | 5 | Filter 71 - Recommend the "Nominate for Deletion" gadget - Shows a hint to the user when the file description contains text suggesting that the user wants to propose a deletion. | Should just give a warning and add a tag. | Makes no sense on older revisions. |
abusefilter_warning_mp3 | 3 | Filter 192 - Restrict MP3 uploads - Uploading MP3 files is generally forbidden. | Blocks the import completely. | Makes sense on older revisions. |
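To put the AbuseFilter share into numbers for the discussion, the counts from the 365-day list can be summed up. The following is only a small sketch: the dictionary and function names are made up for illustration, and the grouping into "warning only" rules follows the "Should just give a warning" column above, not any FileImporter code.

```python
# Counts copied from the 365-day error list in this task.
ABUSEFILTER_ERRORS = {
    "abusefilter_warning_otrs": 109,
    "abusefilter_warning_blanking2": 34,
    "abusefilter_warning_copyv2": 30,
    "abusefilter_disallowed": 21,
    "abusefilter_warning_review": 11,
    "abusefilter_warning_use_delete_gadget": 5,
    "abusefilter_warning_mp3": 3,
}

# Rules the table describes as "Should just give a warning"
# (hypothetical grouping, taken from the table and nothing else).
WARNING_ONLY = {
    "abusefilter_warning_otrs",
    "abusefilter_warning_copyv2",
    "abusefilter_warning_use_delete_gadget",
}


def total(counts):
    """Sum all AbuseFilter-related errors."""
    return sum(counts.values())


def warning_only_total(counts, warning_only):
    """Sum only the errors caused by rules that are meant to warn, not block."""
    return sum(n for name, n in counts.items() if name in warning_only)


if __name__ == "__main__":
    print("all AbuseFilter errors:", total(ABUSEFILTER_ERRORS))
    print("from warning-only rules:",
          warning_only_total(ABUSEFILTER_ERRORS, WARNING_ONLY))
```

Under these assumptions, 144 of the 213 AbuseFilter errors come from rules that are only supposed to warn.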
The rules that should just give a warning are currently not working as intended. It seems that the FileImporter blocks these imports completely. The expected behavior would be to show a warning and then allow the user to click the submit button again and continue with the import. If I remember correctly, this worked at some point. I think this workflow broke with the introduction of the separate error page, which does not allow submitting again.
In some of these cases a tag should be added in addition to the warning. The FileImporter does not do this either. I think this never worked.
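The expected behavior described above can be sketched as a tiny model. This is not FileImporter or AbuseFilter code: `FilterResult`, `ImportAttempt`, and `try_import` are hypothetical names, and the model only assumes that a warning is shown once and a second submit of the same import goes through, with any tags from the filter applied.

```python
from dataclasses import dataclass, field


@dataclass
class FilterResult:
    """Hypothetical outcome of one filter: 'pass', 'warn', or 'disallow'."""
    action: str
    tags: list = field(default_factory=list)


@dataclass
class ImportAttempt:
    """Hypothetical per-import state kept between two submits."""
    acknowledged: set = field(default_factory=set)
    applied_tags: list = field(default_factory=list)


def try_import(attempt, filter_name, result):
    """Return 'blocked', 'warned', or 'imported' for one submit."""
    if result.action == "disallow":
        # Blocking rules (e.g. the MP3 rule) never let the import through.
        return "blocked"
    if result.action == "warn" and filter_name not in attempt.acknowledged:
        # First submit: show the warning and remember that it was shown.
        attempt.acknowledged.add(filter_name)
        return "warned"
    # Second submit (or no warning at all): import and apply the tags.
    attempt.applied_tags.extend(result.tags)
    return "imported"


if __name__ == "__main__":
    attempt = ImportAttempt()
    otrs = FilterResult(action="warn", tags=["example-tag"])  # tag name is made up
    print(try_import(attempt, "filter-69", otrs))  # first submit
    print(try_import(attempt, "filter-69", otrs))  # second submit
    print(attempt.applied_tags)
```

In this model the first submit returns `warned` and the second returns `imported` with the tag applied, which is the workflow the separate error page currently prevents.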
abusefilter_warning_copyv2 […] Makes no sense on older revisions
Oh, take care. A copyright violation in an older revision is still a copyright violation. It's accessible to everybody. Wikimedia hosts it. Wikimedia can be sued. Or worse: the original uploader can be sued.
Normally old versions with copyright violations are suppressed. In that case there should be no need to handle them differently from other files with suppressed versions.
Maybe one word on that. The rule in question is just a very rough attempt to detect possible violations early. It only gives the user a warning and still allows the import in a second step. Also consider that the source file would need to carry a template suggesting a compatible license before the user even gets to the import that would trigger the warning.
So even if we ignored the warning on old revisions (which we will not do), several people would have to do things wrong before the FileImporter could play a part in getting anyone sued. :-)
The investigation is done. I created three follow-up tickets for the next steps. Probably not all of them will be implemented right away.
T253872: Respect AbuseFilter warnings triggered by imports
T253874: Add tags to the import if triggered by the AbuseFilter
T253876: Offer variables that surface the import steps to the AbuseFilter