Set up a "file dropbox" or similar for temporary storage of files pending server-side upload
Closed, Declined · Public · Feature

Description

Larger file uploads are becoming more common, and due to various restrictions, they can't be directly uploaded.

We should really come up with some way for people to upload and dump large files for us to grab from there, rather than them needing large amounts of third-party storage for us to wget from.


Version: unspecified
Severity: enhancement
URL: https://commons.wikimedia.org/wiki/Help:Server-side_upload

Details

Reference
bz31828

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 11:56 PM
bzimport added a project: Cloud-VPS.
bzimport set Reference to bz31828.

What's the ETA on chunked uploads, which presumably obsoletes that? Any tasks that need to be done to finish or review it prior to deployment?

(People could use, say, DropBox.com right now to do this.)

(In reply to comment #1)

What's the ETA on chunked uploads, which presumably obsoletes that? Any tasks that need to be done to finish or review it prior to deployment?

There was talk of deploying it this week, until it was noticed not all code was merged....

Where has it not been merged from? Is there a branch or something?

(In reply to comment #4)

Where has it not been merged from? Is there a branch or something?

The code for it is in trunk, but hasn't made it into 1.18wmf1 yet etc

I'd support a public dropbox folder, if that's the best alternative we have. The arbitrary size limitation on Commons and the lack of well-developed tools to make high quality contributions a possibility is rather ridiculous imo :(

Not going to happen I think (as with bug 31984) and completely unclear use case.

(In reply to comment #1)

What's the ETA on chunked uploads, which presumably obsoletes that?

Deployed.

I don't agree with Nemo's bug resolution.

(1) Despite chunked uploads, we still have upload request bugs

(2) Fastily still has an issue today

(3) We now have a Labs infrastructure.

If I get a green light to use Labs for this purpose, I'm willing to create an instance in the commons-dev project to allow such uploads, with 3 features:

  • as all media agencies do, a public FTP to upload files, with the password given to every interested person
  • for power users working in the console world, the possibility to request a shell account socially restricted to wget/fetch/curl-like commands
  • a web download form that instructs the server to download a remote file

[ Adding Ryan Lane in cc. Ryan > would it be acceptable to use an m1.xsmall instance for this? ]

Clarified summary for what appears to be the use case according to Dereckson: it's an extremely limited one, not a general-purpose "file dropbox" as the previous summary suggested.
[Personal opinion: archive.org has everything you need.]

Nemo pointed out to me on IRC that this bug lacked "any actual description [of the service], what purpose it would serve and how it would look like".

Use case

Commons contributors sometimes have to upload big files. They can't do that on Commons, either because their own bandwidth doesn't allow it or because the file is larger than 500 MB.

In such cases, the contributor files a bug on Bugzilla giving the file URL.

Sometimes users have trouble storing the files.

Alternatives are services like Vimeo or Dropbox (Nemo also suggested archive.org), but they all have drawbacks, the least being having to review, analyse and accept a third-party agreement, the worst being usability and space or bandwidth limitations.

Rationale

  • Would ease the life of contributors
  • Would allow niceties like a bug publishing wizard.
  • Would increase Commons participation

Service description

In a nutshell

A transitional service to host large files having to be published on Wikimedia Commons.

Phase 1

  • Public FTP protected by a semi-public password, given to everyone who has to upload a file (the only goal is to keep out people scanning IP ranges; this works very well with graphics or media agencies: VICE magazine, for example, publishes its password in a specifications PDF in its Ads section, without any major issue).
  • Shell access for power users
  • Web form to download a file from a remote location

By social convention, FTP and shell access are restricted to transferring files to be published on Wikimedia Commons.
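
To illustrate the third item (a web form asking the server to fetch a remote file), here is a minimal sketch of how the server-side copy step might work, streaming the download in chunks and aborting past a size cap. The cap, the `TooLarge` exception and the `copy_with_limit` helper are assumptions made for this example; nothing in this proposal specifies them.

```python
import io


class TooLarge(Exception):
    """Raised when the source yields more bytes than the allowed maximum."""


def copy_with_limit(src, dst, max_bytes, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk (hypothetical helper).

    src and dst are binary file-like objects. Returns the number of
    bytes written, or raises TooLarge once more than max_bytes have
    been read, so an oversized remote file is never fully stored.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)
        if total > max_bytes:
            raise TooLarge(f"source exceeds {max_bytes} bytes")
        dst.write(chunk)
```

With the standard library, `src` could be the response object returned by `urllib.request.urlopen(url)` and `dst` a file opened with `open(path, "wb")`.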

Phase 2

  • Create a dedicated Bugzilla account to allow users who don't have one to use this service (it would rely on TUSC authentication) through a bug publishing wizard. It would file a bug with the upload request in a standardized format.

I definitely strongly support the creation of a labs instance with a public ftp. That would greatly simplify the large file contribution process, not only for me, but for editors in general.

On a related note, chunked uploads are still broken and waiting on bug 36587.

(In reply to comment #10)

  • Would increase Commons participation [...]

Phase 2

  • Create a dedicated Bugzilla account to allow users who don't have one to use this service (it would rely on TUSC authentication) through a bug publishing wizard. It would file a bug with the upload request in a standardized format.

Of course this part belongs to another bug or new project proposal or whatever: it assumes WMF sysadmins will process an unlimited amount of server-side upload requests, which is not the case right now.

This honestly terrifies me from a legal POV. I honestly don't think I can say yes on this without talking to our legal team first.

(In reply to comment #13)

This honestly terrifies me from a legal POV. I honestly don't think I can say
yes on this without talking to our legal team first.

Prioritising getting bug 36587 fixed as a chunked upload/upload stash blocker would be a good idea...

Is this still relevant?

I'll echo Ryan's comments about legal approval being required.

yuvipanda raised the priority of this task from Low to Needs Triage. Nov 24 2014, 1:49 PM
yuvipanda subscribed.

Why is FTP actually needed? As soon as SSH is running anyway, you can already use SCP, and there are GUI clients for SCP (WinSCP et al.) that make the user experience pretty much identical to using an FTP client, without having to run any additional service.

The simplest solution here is probably to let the cluster access some wmflabs.org domain, which can then also be added to whitelisted upload_by_url sources for GWT etc.
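
As a rough sketch of what checking a URL against such a whitelist could look like: the code below accepts only http(s) URLs whose host is a listed domain or one of its subdomains. The `ALLOWED_DOMAINS` tuple and the `is_allowed_source` function are hypothetical names for this example and are not taken from MediaWiki or GWT.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; the actual whitelisted domains are not given here.
ALLOWED_DOMAINS = ("wmflabs.org",)


def is_allowed_source(url, allowed=ALLOWED_DOMAINS):
    """Return True if url is http(s) and its host is an allowed domain
    or a subdomain of one."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    # Match on a dot boundary so "evilwmflabs.org" does not pass.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on the dot boundary, rather than a plain suffix test, is the point of the sketch: a look-alike domain that merely ends in the same characters is rejected.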

Is this still an ongoing blocker for people?

A quick poll on the wikimedia-commons IRC channel shows this is still needed, but not with a high priority.

Maybe we should try ownCloud. :D

Maybe we should try ownCloud. :D

I had the same thought. ownCloud is free software and has that exact dropbox-like functionality.

chasemp set Security to None.
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:00 AM
Aklapper removed a subscriber: chasemp.
taavi subscribed.

I don't think this is relevant anymore, and I feel like this would have the exact same problems with uploading large files that it's trying to work around.