Deploying FileExporter and FileImporter
Open, Normal, Public

Description

Why both extensions together?
Although FileExporter and FileImporter are technically two extensions, we want to treat them as one in beta, at least from the users' point of view. For the beta, the FileExporter should be deployed on Wikipedias, where it adds a link to the FileImporter. The FileImporter should be a Commons extension, but it is only meant to be reached through the FileExporter's link, i.e. to move the file on whose page the link was clicked.
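
As a rough illustration of the hand-off described above, the following sketch builds the kind of link an exporter could place on a source wiki's file page. The Special:ImportFile page name comes from the test instructions in this task; the `clientUrl` query parameter, the host names, and the helper name are assumptions made for illustration, not confirmed details of either extension.

```python
from urllib.parse import quote, urlencode


def build_import_link(target_wiki_base: str, source_file_page_url: str) -> str:
    """Return a link to the import page on the target wiki.

    Assumption: the import page accepts the URL of the source file page
    in a query parameter named `clientUrl` (hypothetical in this sketch).
    """
    # Percent-encode the whole source URL so it survives as one query value.
    query = urlencode({"clientUrl": source_file_page_url}, quote_via=quote)
    return f"{target_wiki_base}/wiki/Special:ImportFile?{query}"


link = build_import_link(
    "https://commons.example.org",                     # hypothetical target wiki
    "https://de.example.org/wiki/Datei:Beispiel.jpg",  # hypothetical source page
)
print(link)
```

The point of the sketch is only that the source page travels with the link, so the importer knows which file to move without the user typing a URL.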

Info on User, Control and Dataflow
can be found in the README, e.g. here: https://github.com/wikimedia/mediawiki-extensions-FileImporter/blob/master/README.md

Testing the FileImporter
You can visit https://test.wikimedia.beta.wmflabs.org/wiki/Special:ImportFile, and then use e.g. the Commons picture of the day to test the import.
Please note: The first page, where you enter a URL, is there only for testing purposes and will not be present when the FileImporter is deployed. Furthermore, importing from Commons is possible only in this test setup, not once the extension is live.

Needed for beta deployment of the extensions:

  • Security Review for FileExporter: Done in T158661
  • Security Review for FileImporter: Done in T160982
  • Design Review: All designs come from WMDE-Design
  • Performance Review: Requested, then cancelled because it is no longer necessary
  • Deploy on beta sites: Done in T181383
  • Deploy on test wikis: Scheduled for June 12th
  • Deploy on de-wiki, ar-wiki and fa-wiki: Scheduled for June 25th

Community Consensus
These extensions are being developed to fulfil a wish of the 2013 de-wiki technical wishes survey. See here for more info on the project.

Lea_WMDE created this task. Mar 26 2018, 7:19 PM
Restricted Application added a project: TCB-Team. Mar 26 2018, 7:19 PM
Restricted Application added a subscriber: Aklapper.

Change 422183 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/FileImporter@master] [WIP] Describe command and data flow in README

https://gerrit.wikimedia.org/r/422183

Lea_WMDE triaged this task as Normal priority. Apr 18 2018, 1:48 PM
Lea_WMDE updated the task description.

Operations, is there anything from your side that we should take care of to move these two extensions forward?

Lea_WMDE updated the task description. Apr 18 2018, 1:54 PM
Lea_WMDE updated the task description.

@Joe @fgiunchedi It would probably be a good idea if one or both of you took a look at this, just to be sure that the choices being made around where to put temp files, limits on size and quantity, etc. are all reasonable.

@Gilles We decided this didn't really need a perf review, but if you have a few minutes free it couldn't hurt to look things over.

Joe added a comment. Wed, May 23, 5:47 AM

Looking through the tickets and the Readme of the extension, I think the default configurations are somewhat sane, but I'd reduce the total size of the images that can be fetched probably.

I don't think there is time to change the way the extension works before deployment, but I think we would have better performance and a more sound architecture if we used Swift's COPY function to copy images to their final destination in Commons; @Lea_WMDE do you think that's a direction we can explore for a future enhancement?

A Swift COPY is possible, but would require exposing sharding information of all wikis to a given wiki's config. Right now each wiki only gets information about its own sharding. It also requires significant refactoring or trickery in the SwiftFileBackend family of classes, which is currently architected around dealing with one wiki-specific "FileBackend" at a time.
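
To make the shape of such a server-side copy concrete, here is a minimal sketch that only describes the request and sends nothing. The COPY verb and the `Destination` header are part of the OpenStack Swift object API; the account name, the container names with their shard-like suffixes, and the object paths are hypothetical and do not reflect the real container layout of any wiki.

```python
from urllib.parse import quote


def build_swift_copy_request(account, src_container, src_object,
                             dst_container, dst_object):
    """Describe a server-side object copy as (method, path, headers).

    Nothing is sent; the tuple only shows what the caller must know.
    """
    # The request path addresses the source object.
    path = "/v1/{}/{}/{}".format(
        quote(account), quote(src_container), quote(src_object)
    )
    headers = {
        # The target is given as URL-encoded "container/object".
        "Destination": "{}/{}".format(quote(dst_container), quote(dst_object)),
    }
    return "COPY", path, headers


method, path, headers = build_swift_copy_request(
    "AUTH_demo",                # hypothetical account
    "source-wiki-public.a1",    # hypothetical source container (with shard)
    "a/a1/Example file.jpg",
    "target-wiki-public.7f",    # hypothetical target container (with shard)
    "7/7f/Example file.jpg",
)
print(method, path, headers)
```

Note that the caller has to name the destination container itself, which is why the source wiki would need to know how the target wiki's containers are sharded.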

Looking through the tickets and the Readme of the extension, I think the default configurations are somewhat sane, but I'd reduce the total size of the images that can be fetched probably.

Would you want the limit reduced right away? We plan to deploy to de-wiki and one RTL wiki as a first step, and will log the sizes people actually try to move. We would have waited until we had some data before reevaluating the limits, but if you feel we should start with a different size limit, let me know which one it should be :)

I don't think there is time to change the way the extension works before deployment, but I think we would have better performance and a more sound architecture if we used Swift's COPY function to copy images to their final destination in Commons; @Lea_WMDE do you think that's a direction we can explore for a future enhancement?

Generally, we should invest time to get better performance and a more sound architecture. However, I propose to reevaluate that topic once we have more data on how much load is actually created. @Addshore has already looked into using Swift's COPY function quite a bit, but without success. So before we look into it again, we should have some more discussions with you, him, @Gilles and other knowledgeable people to make sure this can actually work. I suggest we do that once we feel the effort is worth it relative to the load being created.

Joe added a comment. Wed, May 23, 4:33 PM

In short:

  • It's OK to reevaluate the limits after we deploy. 250 MB is not a huge amount of data, and the limit will save us from some of the most absurd situations we have, like huge TIFF or DjVu collections being moved and killing an appserver and/or Swift in the process.
  • I'm OK with the approach taken for now, and I think it's fair to want to know how much this feature will actually be used before pouring more resources into it. I just intuitively assumed it would not be such a daunting task, but now that I have looked at the code for SwiftFileBackend, I get why it's not straightforward to do.
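
A size limit combined with logging of attempted sizes, as discussed above, could look roughly like the sketch below. It assumes that the limit applies to the summed size of everything fetched for one import and that 250 MB means 250 × 1024 × 1024 bytes; the function and logger names are hypothetical and are not taken from the extension.

```python
import logging

logger = logging.getLogger("import_size_sketch")  # hypothetical logger name

# Assumption: "250 MB" is read as 250 MiB and applies to the total per import.
MAX_TOTAL_BYTES = 250 * 1024 * 1024


def check_import_size(revision_sizes, max_total_bytes=MAX_TOTAL_BYTES):
    """Return (allowed, total_bytes) for one attempted import.

    `revision_sizes` holds the size in bytes of each file revision that
    would be fetched. The total is logged either way, so real usage data
    is available when the limit is reevaluated.
    """
    total = sum(revision_sizes)
    allowed = total <= max_total_bytes
    logger.info("import attempt: %d bytes, allowed=%s", total, allowed)
    return allowed, total


mib = 1024 * 1024
print(check_import_size([100 * mib, 100 * mib]))
print(check_import_size([200 * mib, 100 * mib]))
```

Logging rejected attempts as well as accepted ones matters here, because the rejected sizes are exactly the data needed to judge whether the limit is too strict.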

Change 436356 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/FileImporter@master] [WIP] Add documentation about throttling and alternative backends

https://gerrit.wikimedia.org/r/436356

JStrodt_WMDE updated the task description. Wed, Jun 6, 1:30 PM
JStrodt_WMDE updated the task description.
He7d3r added a subscriber: He7d3r. Mon, Jun 11, 10:31 PM
Lea_WMDE updated the task description. Wed, Jun 13, 7:14 AM

Change 436356 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Add documentation about throttling and alternative backends

https://gerrit.wikimedia.org/r/436356

Gilles removed a subscriber: Gilles. Mon, Jun 18, 12:18 PM