Page MenuHomePhabricator

Reorder order of file-move-checks to show blocking errors first
Closed, ResolvedPublic5 Estimated Story Points

Description

Motivation
Currently it may happen, that you are not allowed to move a file, but the error message you see first is about something fixable (and you will only be stopped from making the move, after you corrected the first issue. Ideally people are confronted with the hard bugs first.

Acceptance Criteria

  • Reorder the checks so that we reach the following order:
    1. Check if the user is allowed to perform a move an upload.
    2. Check if the file move is allowed to happen, e.g.
      1. File exists on Commons
      2. File extension and MIME type do not match
    3. Check if there are any solvable issues, e.g.
      1. Title has already been given
      2. Title includes forbidden characters
  • The AbuseFilter is currently run on submit. Since we are not clear yet, what the final interaction between it and FileImporter should look like, integrating the AbuseFilter will be handled later.

Notes
Currently run tests in current order:

  • runCommonsHelperChecksAndConversions does all CommonsHelper validations and cleanup.
  • runWikiLinkConversions prefixes and unlocalizes namespaces in wiki links.
  • runBasicTitleCheck is some basic sanity. This also reports if automatic changes have been made to the file name.
  • runPermissionTitleChecks checks if the user can upload files.
  • runFileExtensionCheck checks if file extensions are given, and match.
  • runDuplicateFilesCheck checks for duplicate uploads based on their SHA1 hash.
  • runFileTitleCheck checks if the file name is allowed on the target wiki.
  • runLocalTitleConflictCheck checks if a file with the same name already exists.
  • runRemoteTitleConflictCheck only runs in case the file was renamed. It checks if there is another file on the source wiki with this name.

Side note
Note it was impossible to fix this before https://gerrit.wikimedia.org/r/510589, because the two checks for CommonsHelper stuff and duplicates happened in two entirely different places, impossible to swap. Now it's really easy: ImportPlanValidator lists all checks.

Event Timeline

Lea_WMDE renamed this task from Misleading error message when trying to import a file from Wikimedia Commons to Reorder order of file-move-checks to show blocking errors first.May 29 2019, 6:55 AM
Lea_WMDE updated the task description. (Show Details)
Lea_WMDE set the point value for this task to 5.

Some steps were too difficult to reorder (in bold below), and others which were not discussed in the acceptance criteria are added (in parentheses). This reflect the state of the patch as implemented so far:

  1. File exists on Commons
  2. Check if the user is allowed to perform an upload.
  3. Check if the file move is allowed to happen (unrecoverable errors)
    1. (Check whether commonshelper config exists for the source wiki.)
    2. (File extension has been changed -- I don't know how this might happen.)
    3. (File is a duplicate.)
  4. Check if there are any solvable issues, e.g.
    1. Title includes forbidden characters, too long, etc.
    2. (Title can be automatically fixed -- title is fixed in this step)
    3. Title already exists locally
    4. (New title already exists on the source wiki)
  5. Validations only after submit
    1. File extension and MIME type do not match
    2. AbuseFilter

Please let me know whether this is good enough, or whether we should scope and estimate the more difficult changes.

Ran into something interesting. If the title is malformed, then several of the checks can't run at all, for example "Check if the user is allowed to perform a move / upload" only makes sense in the context of a target filename, so we can't do this before the filename is at least hypothetically valid.

Shall we move the basic validity check back to the top again?

Change 513615 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/FileImporter@master] [WIP] Reorder validations so that unrecoverable errors come first

https://gerrit.wikimedia.org/r/513615

(unassigning myself so anyone can take a look at this)

summarizing daily: For now we are ok with "good enough". No further scoping necessary :)

Change 513615 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Reorder validations so that unrecoverable errors come first

https://gerrit.wikimedia.org/r/513615

Very little changed in this patch. One way to verify would be to provide a title which is automatically rewritten--I think leading whitespace should trigger this logic? With the newly patched code, the "no permission to upload" error will take precedence even if the title will be rewritten.

I had a quick look at the patch, and it's a nice and helpful change. Unfortunately it seems to miss one thing. Why can't we run all these unrecoverable errors (user isn't allowed to upload files, bad file extension, duplicate file already exists) before the wikitext conversions? That's what I originally reported: The error message that asks to create a config page for the source wiki (that's a recoverable error!) should not appear before any unrecoverable one. Is there a technical reason this isn't possible?

Change 514308 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/FileImporter@master] Run recoverable CommonsHelper checks after unrecoverable errors

https://gerrit.wikimedia.org/r/514308

Change 514308 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Run recoverable CommonsHelper checks after unrecoverable errors

https://gerrit.wikimedia.org/r/514308

Lea_WMDE moved this task from Demo to Done on the WMDE-QWERTY-Sprint-2019-05-29 board.