Page MenuHomePhabricator

Investigate prerequisite checks of Commons File Upload page and add them to the File Importer, too
Closed, ResolvedPublic

Description

Motivation
Files that are moved with the FileImporter should adhere to Commons standards.

Task
Investigate which prerequisistes are set when using the Commons file upload page and implement the same standards for the File Importer. This means only background checks that the user would normally not be aware of.

Event Timeline

@Lea_WMDE Does this primarily mean things such as max file size & allowed file types or are we thinking about a different set of prerequisite checks?

It means that we should do everything that would be done if the file was uploaded through the "traditional" uploading means of commons through
https://commons.wikimedia.org/w/index.php?title=Special:Upload
or
https://commons.wikimedia.org/wiki/Special:UploadWizard
I guess that they will check for max file size and allowed file types, but please make sure they are not checking for anything else

Lea_WMDE moved this task from Proposed to Todo on the WMDE-QWERTY-Team board.Oct 30 2017, 12:09 PM

Requirement 1: "File was uploaded through the "traditional" uploading means of Commons"
https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads

We could essentially just read the following globals set in LocalSettings.php and implement our own checks in FileImporter.
For allowed file types: https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Configuring_file_types
For max file size: https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Set_maximum_size_for_file_uploads
e.g. If a user tries to upload a file with an excluded file extension we simply throw an exception.

Requirement 2: "File was uploaded using the UploadWizard of Commons"
https://www.mediawiki.org/wiki/Extension:UploadWizard#Other_configuration

The only checks of interest I can see UploadWizard is carrying out are:

  1. 'maxUploads' -> not applicable in FileImporter since we are only importing one image at a time
  2. 'fileExtensions' -> handled by global $wgFileExtensions see Requirement 1

Requirement 1: "File was uploaded through the "traditional" uploading means of Commons"
https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads
We could essentially just read the following globals set in LocalSettings.php and implement our own checks in FileImporter.
For allowed file types: https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Configuring_file_types

So types are checked in UploadBase @ https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/upload/UploadBase.php;847bc2b8d7756c36551a768f5cf0d36bc50705b6$981-987
This is in the getTitle method which we call in ValidatingUploadBase::validateTitle @ https://phabricator.wikimedia.org/source/mediawiki-extensions-FileImporter/browse/master/src/Services/UploadBase/ValidatingUploadBase.php;6dfb9ab493e44e1185a4828c1f8bba84d72bfc75$46

For max file size: https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Set_maximum_size_for_file_uploads
e.g. If a user tries to upload a file with an excluded file extension we simply throw an exception.

It looks like right now we only check the file size on actual download / fetch of the file.
FileChunkSaver is constructed with a max number of bytes to save https://phabricator.wikimedia.org/source/mediawiki-extensions-FileImporter/browse/master/src/Services/Http/FileChunkSaver.php;a24cd8c08d8bbe035cd94ba04207ed4f5c17c41b$49
and the ServiceWiring passes in the max upload size https://phabricator.wikimedia.org/source/mediawiki-extensions-FileImporter/browse/master/src/ServiceWiring.php;a24cd8c08d8bbe035cd94ba04207ed4f5c17c41b$43

We should probably also check this earlier, before showing the preview page, as the API response we get should contain the size of each file revision.
I have created T182062: Check each file revision size against max allowed before previewing to track this.

Requirement 2: "File was uploaded using the UploadWizard of Commons"
https://www.mediawiki.org/wiki/Extension:UploadWizard#Other_configuration
The only checks of interest I can see UploadWizard is carrying out are:

  1. 'maxUploads' -> not applicable in FileImporter since we are only importing one image at a time

Yup, we don't need to worry about this one.

  1. 'fileExtensions' -> handled by global $wgFileExtensions see Requirement 1

Same as above

Addshore closed this task as Resolved.Dec 5 2017, 9:04 AM
Addshore claimed this task.
Restricted Application added a project: User-Addshore. · View Herald TranscriptDec 5 2017, 9:04 AM
Addshore moved this task from Todo to Done on the WMDE-QWERTY-Team board.Dec 5 2017, 10:50 AM
Tobi_WMDE_SW moved this task from Done to Demoed on the WMDE-QWERTY-Team board.Feb 20 2018, 4:53 PM