
Upload-by-URL should enforce $wgMaxUploadSize early when Content-Length header provided
Closed, Resolved · Public

Description

Currently upload-by-URL enforces $wgMaxUploadSize by counting bytes as it downloads, then aborting when the count reaches the maximum.

That can take a long time. cURL can give us the HTTP response headers, including any Content-Length header the server provided, long before we reach that point, in which case we could abort immediately.

Ideally this ability should be folded into the Http class.
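
A minimal sketch of what that early check could look like, using plain cURL rather than the Http class; the function name and the fallback behaviour are illustrative, not existing MediaWiki code:

```php
<?php
// Hypothetical helper: do a HEAD request first and reject the URL if the
// server reports a Content-Length above the configured maximum.
function wfCheckRemoteSize( $url, $maxSize ) {
	$ch = curl_init( $url );
	curl_setopt( $ch, CURLOPT_NOBODY, true );         // HEAD request: headers only
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_exec( $ch );
	// CURLINFO_CONTENT_LENGTH_DOWNLOAD is -1 when no Content-Length was sent
	$length = curl_getinfo( $ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD );
	curl_close( $ch );

	if ( $length > $maxSize ) {
		return false; // too big: abort before downloading anything
	}
	return true; // size unknown or acceptable; count bytes during download
}
```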


Version: unspecified
Severity: enhancement
URL: http://test.wikipedia.org/wiki/Special:Upload

Details

Reference
bz18201

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 10:33 PM
bzimport set Reference to bz18201.

mdale wrote:

Working on this in the new-upload branch... I will try to add a HEAD request using the Http class to do some early detection, but in cases where there is no Content-Length header we will have to count bytes as the file downloads.

The architecture also has to change a bit: we have to spin the action off into a separate command-line PHP process that monitors the cURL copy and updates memcached (or the database if memcached is not installed). The client then makes AJAX requests and gets updates on how far along the transfer is; the spun-off process actually creates the resource page and informs the client when it is ready. We will keep things in sync by passing the session key to the process (unless that is a bad way, in which case what would be a good way?).

We have to spin it off into a separate process because our PHP execution times out after 30 seconds.
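
A hypothetical sketch of that progress scheme, assuming the PECL Memcached client: the background process stores its byte count under a key derived from the upload session, and the AJAX handler the client polls reads it back. The key format and helper names are illustrative, not the branch code.

```php
<?php
$mc = new Memcached();
$mc->addServer( '127.0.0.1', 11211 );

// Called periodically from the downloading process:
function updateUploadProgress( Memcached $mc, $sessionKey, $bytesSoFar ) {
	// Expire after an hour so abandoned transfers clean themselves up
	$mc->set( "upload-progress:$sessionKey", $bytesSoFar, 3600 );
}

// Called from the AJAX endpoint the client polls:
function getUploadProgress( Memcached $mc, $sessionKey ) {
	$bytes = $mc->get( "upload-progress:$sessionKey" );
	return $bytes === false ? 0 : (int)$bytes;
}
```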

This also involves rewriting the Special:Upload page for HTTP requests and building a small AJAX interface for progress. (We can use the same AJAX progress-indicator interface that we are using for Firefogg upload progress.)

But that touches on the same theme of getting jQuery into core, which will speed up the interfaces for all of these enhancements.

mdale wrote:

(fixed) We first do a HEAD request; if the reported size is less than $wgMaxUploadSize, we continue. We then use the cURL writeBodyCallBack function to write to a file; if the file grows beyond $wgMaxUploadSize, we break out as well.

That should work great for most static files. :D
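
A rough sketch of that write-callback guard, using CURLOPT_WRITEFUNCTION directly; $url, $tmpPath, and the surrounding setup are assumptions rather than the actual new-upload branch code:

```php
<?php
$maxSize = $wgMaxUploadSize;
$written = 0;
$fh = fopen( $tmpPath, 'wb' );

$ch = curl_init( $url );
curl_setopt( $ch, CURLOPT_WRITEFUNCTION,
	function ( $ch, $data ) use ( $fh, &$written, $maxSize ) {
		$written += strlen( $data );
		if ( $written > $maxSize ) {
			return 0; // writing fewer bytes than received aborts the transfer
		}
		return fwrite( $fh, $data );
	}
);
$ok = curl_exec( $ch ); // false (CURLE_WRITE_ERROR) when we abort mid-transfer
curl_close( $ch );
fclose( $fh );
```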

I don't think Content-Length is a required header, though; if it's not present, the current code in the branch will spew a notice at "if($head['Content-Length'] > $wgMaxUploadSize){".
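
One way to guard the comparison so a missing header no longer raises a notice (a sketch only; the eventual fix in the branch may differ):

```php
if ( isset( $head['Content-Length'] )
	&& $head['Content-Length'] > $wgMaxUploadSize
) {
	return false; // reject as too large before downloading
}
// If the header is absent, fall through to counting bytes during download.
```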

mdale wrote:

Fixed the Content-Length handling a while back (just catching up on the backlog).

The pre-download checks also look for redirects, make sure they point to a valid URL, and enforce the $wgMaxRedirects global variable.
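
For illustration, the redirect limit could be handed to cURL directly, assuming the value comes straight from the $wgMaxRedirects global; the branch may instead validate each hop in PHP before following it:

```php
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_MAXREDIRS, $wgMaxRedirects ); // fail once past the limit
curl_setopt( $ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS ); // only http(s) URLs
```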

mdale wrote:

Fixed. Also, the cURL headers handling was added in r53620.