Uploading a big PDF file failed
Closed, Resolved · Public

Description

Following the problem reported here https://commons.wikimedia.org/wiki/User_talk:Rillke/Discuss/2019#Chunk_upload_not_working
I am trying to upload the PDF file from https://archive.org/details/20190823_20190823_0935 (473 MB).
I want to upload_by_url over https://commons.wikimedia.org/wiki/File:%E0%A4%B6%E0%A4%BF%E0%A4%B2%E0%A5%8D%E0%A4%AA%E0%A4%95%E0%A4%BE%E0%A4%B0_%E0%A4%9A%E0%A4%B0%E0%A4%BF%E0%A4%A4%E0%A5%8D%E0%A4%B0%E0%A4%95%E0%A5%8B%E0%A4%B6_%E0%A4%96%E0%A4%82%E0%A4%A1_%E0%A5%AD_-_%E0%A4%9A%E0%A4%BF%E0%A4%A4%E0%A5%8D%E0%A4%B0%E0%A4%AA%E0%A4%9F,_%E0%A4%B8%E0%A4%82%E0%A4%97%E0%A5%80%E0%A4%A4.pdf but I repeatedly get an error (I tried at least 10 times).
I also tried to upload it as a new file, with the same result.
Final name should be File:शिल्पकार चरित्रकोश खंड ७ - चित्रपट, संगीत.pdf

Event Timeline

Yann created this task. Fri, Aug 23, 8:49 PM
MarcoAurelio added a subscriber: MarcoAurelio.

Maybe it's because the file is too big to upload via the UI. Tagging site-requests so they can evaluate if this needs to be done via a server-side upload.

It should be fine to upload a file that's below one GB. Trying myself, will post update.

There are apparently some codfw issues; see -operations for details. It looks like there are some problems.

Tried via mwdebug1001, so I can distinguish the various attempts from each other.

mwlog1001:/srv/mw-log/apache2.log
Aug 23 22:00:31 mwdebug1001: [proxy_fcgi:error] [pid 30875:tid 139725847308032] [client 10.64.0.53:36798] AH01067: Failed to read FastCGI header, referer: https://commons.wikimedia.org/wiki/Special:Upload
Aug 23 22:00:31 mwdebug1001: [proxy_fcgi:error] [pid 30875:tid 139725847308032] (104)Connection reset by peer: [client 10.64.0.53:36798] AH01075: Error dispatching request to : , referer: https://commons.wikimedia.org/wiki/Special:Upload
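To tell individual attempts apart in a log like the one above, the entries can be grouped by process ID and client address. The following is only a rough sketch: the regular expression is inferred from the two sample lines quoted in this task, not from any Apache format specification, and the names `parse_line` and `group_by_request` are made up for illustration.

```python
import re
from collections import defaultdict

# Layout assumed from the two mod_proxy_fcgi lines quoted in this task only.
LINE_RE = re.compile(
    r"^(?P<when>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+): "
    r"\[(?P<module>\w+):(?P<level>\w+)\] "
    r"\[pid (?P<pid>\d+):tid (?P<tid>\d+)\] "
    r"(?:\((?P<errno>\d+)\)(?P<errtext>[^:]+): )?"  # optional "(104)Connection reset by peer: "
    r"\[client (?P<client>[\d.]+:\d+)\] "
    r"(?P<code>AH\d+): (?P<message>.*)$"
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def group_by_request(lines):
    """Collect the AH error codes seen for each (pid, client) pair."""
    groups = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            groups[(entry["pid"], entry["client"])].append(entry["code"])
    return dict(groups)


# The two lines from the first attempt, copied from the log excerpt.
SAMPLE = [
    "Aug 23 22:00:31 mwdebug1001: [proxy_fcgi:error] [pid 30875:tid 139725847308032] "
    "[client 10.64.0.53:36798] AH01067: Failed to read FastCGI header, "
    "referer: https://commons.wikimedia.org/wiki/Special:Upload",
    "Aug 23 22:00:31 mwdebug1001: [proxy_fcgi:error] [pid 30875:tid 139725847308032] "
    "(104)Connection reset by peer: [client 10.64.0.53:36798] AH01075: "
    "Error dispatching request to : , referer: https://commons.wikimedia.org/wiki/Special:Upload",
]

if __name__ == "__main__":
    for key, codes in group_by_request(SAMPLE).items():
        print(key, codes)
```

Both sample lines share the same pid and client, so they end up in one group, which matches the reading that they belong to a single failed request.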

It may or may not be related to what @MarcoAurelio says, tagging Operations to judge on that.

Also, ftr, posting the error message:

(although given I used mwdebug1001.eqiad.wmnet, it should go through the eqiad datacenter, shouldn't it?)

Yann added a comment. Sat, Aug 24, 7:15 AM

Maybe it's because the file is too big to upload via the UI. Tagging site-requests so they can evaluate if this needs to be done via a server-side upload.

Well, the interface says

Maximum file size: 4 GB

so there shouldn't be any problem, unless this information is wrong.

Regards,
Yann Forget

CDanis added a subscriber: CDanis.

Thanks for this report!

The codfw issues were mitigated as of around 21:55 UTC Friday -- so if the issue still persists, something else is at fault.

I've just tried it again, to be sure, and it still fails with "Service Temporarily Unavailable". The second try failed with a different error (screenshot):

Request from redacted via cp1079 cp1079, Varnish XID 452002588
Error: 503, Backend fetch failed at Sat, 24 Aug 2019 15:02:30 GMT

while mwlog1001:/srv/mw-log/apache.log still says

mwlog1001:/srv/mw-log/apache.log
Aug 24 15:03:27 mwdebug1001: [proxy_fcgi:error] [pid 27810:tid 139725855700736] [client 10.64.0.53:39350] AH01067: Failed to read FastCGI header, referer: https://commons.wikimedia.org/wiki/Special:Upload
Aug 24 15:03:27 mwdebug1001: [proxy_fcgi:error] [pid 27810:tid 139725855700736] (104)Connection reset by peer: [client 10.64.0.53:39350] AH01075: Error dispatching request to : , referer: https://commons.wikimedia.org/wiki/Special:Upload

I've also found "PHP Fatal Error: entire web request took longer than 200 seconds and timed out" in logstash.
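For a sense of scale, here is a back-of-the-envelope calculation. It assumes that the 200-second limit from the logstash message covers the whole request and that the full 473 MB file has to be transferred within that one request. Neither assumption is confirmed in this task, and the function name is made up for illustration.

```python
def min_throughput_mb_per_s(size_mb, time_limit_s):
    """Lowest sustained transfer rate (MB/s) that fits size_mb into time_limit_s."""
    if time_limit_s <= 0:
        raise ValueError("time limit must be positive")
    return size_mb / time_limit_s


if __name__ == "__main__":
    # 473 MB is the file size from the description, 200 s the limit from logstash.
    rate = min_throughput_mb_per_s(473, 200)
    print(f"needs at least {rate:.3f} MB/s, i.e. roughly {rate * 8:.1f} Mbit/s")
```

Under those assumptions the transfer would need to sustain roughly 2.4 MB/s (about 19 Mbit/s) for the entire request, before any time spent on verification or storage.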

Joe added a subscriber: Joe. Mon, Aug 26, 6:51 AM

A file of 473 MB surely goes over the large file limits unless something changed recently.

https://commons.wikimedia.org/wiki/Help:Server-side_upload still seems to suggest you should request that. Untagging Operations as I don't think there is anything SRE should do here.

Yann added a comment. Mon, Aug 26, 7:00 AM

A file of 473 MB surely goes over the large file limits unless something changed recently.
https://commons.wikimedia.org/wiki/Help:Server-side_upload still seems to suggest you should request that. Untagging Operations as I don't think there is anything SRE should do here.

All of upload_by_url, bigChunkedUpload and the UploadWizard are supposed to allow uploads up to 4 GB. If that's not the case, please say it clearly and fix the documentation. Otherwise, this is a bug.
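As an illustration of why chunked uploads are expected to cope with large files: the file is cut into pieces and each piece travels in its own request, so no single request has to carry the whole file. The sketch below only plans the offsets and makes no network calls. The 4 MiB chunk size is an arbitrary assumption, `plan_chunks` is a made-up helper, and nothing here is taken from the actual bigChunkedUpload or UploadWizard code.

```python
def plan_chunks(file_size, chunk_size):
    """Return a list of (offset, length) pairs that cover file_size bytes in order."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    chunks = []
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


if __name__ == "__main__":
    size = 473 * 1000 * 1000      # assumed byte count for the "473 MB" file
    chunk = 4 * 1024 * 1024       # assumed chunk size, chosen only for illustration
    plan = plan_chunks(size, chunk)
    print(len(plan), "chunks; first:", plan[0], "last:", plan[-1])
```

Each (offset, length) pair would correspond to one upload request, with the pieces assembled on the server once the last one arrives. A timeout can still occur in that final assembly step, so chunking alone does not rule out the failure discussed here.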

Joe added a comment. Mon, Aug 26, 7:42 AM

All of upload_by_url, bigChunkedUpload and the UploadWizard are supposed to allow uploads up to 4 GB. If that's not the case, please say it clearly and fix the documentation. Otherwise, this is a bug.

Sure, but timeouts are a known problem, and there is a bug for that, clearly indicated in the page I linked to: T118887

While I would be happy if that bug were resolved, it certainly is neither new nor undocumented. I think for the goal of this ticket you probably need a server-side upload.

Yann added a comment. Mon, Aug 26, 9:28 AM

Sure, but timeouts are a known problem, and there is a bug for that, clearly indicated in the page I linked to: T118887

On the Commons page linked above, सुबोध कुलकर्णी (Subodh) mentioned that ChunkedUpload doesn't work either, while I did upload many PDF files much bigger than that without a problem:

So this is something new.

Not necessarily; it just doesn't happen every time :). Were those files uploaded via upload_by_url? If so, was it through the standard interface at Special:Upload, or via some gadget/userscript (one of the files mentions https://commons.wikimedia.org/wiki/User:Rillke/bigChunkedUpload.js)?

Urbanecm claimed this task. Mon, Aug 26, 2:55 PM

I'm going to server-side upload this one. If more troublesome files need uploading, please create a new task as described at https://commons.wikimedia.org/wiki/Help:Server-side_upload. I'll happily upload them.

Yann added a comment. Mon, Aug 26, 2:57 PM

Not necessarily; it just doesn't happen every time :). Were those files uploaded via upload_by_url? If so, was it through the standard interface at Special:Upload, or via some gadget/userscript (one of the files mentions https://commons.wikimedia.org/wiki/User:Rillke/bigChunkedUpload.js)?

These were uploaded with Rillke's script.

Yann added a comment. Mon, Aug 26, 4:04 PM

Very well, thanks a lot!

Yann added a comment. Fri, Aug 30, 4:35 PM

Same issue again with the PDF file from https://archive.org/details/20190830_20190830_1349
It is a big file, and I get the same error as before:

Our servers are currently under maintenance or experiencing a technical problem. Please try again in a few minutes.
See the error message at the bottom of this page for more information.
Request from 27.57.200.238 via cp1085 cp1085, Varnish XID 326179637
Error: 503, Backend fetch failed at Fri, 30 Aug 2019 16:32:12 GMT

This is to be uploaded over https://commons.wikimedia.org/wiki/File:%E0%A4%B6%E0%A4%BF%E0%A4%B2%E0%A5%8D%E0%A4%AA%E0%A4%95%E0%A4%BE%E0%A4%B0_%E0%A4%9A%E0%A4%B0%E0%A4%BF%E0%A4%A4%E0%A5%8D%E0%A4%B0%E0%A4%95%E0%A5%8B%E0%A4%B6_%E0%A4%96%E0%A4%82%E0%A4%A1_%E0%A5%A8_%E2%80%93_%E0%A4%B8%E0%A4%BE%E0%A4%B9%E0%A4%BF%E0%A4%A4%E0%A5%8D%E0%A4%AF.pdf

Yann reopened this task as Open. Fri, Aug 30, 4:35 PM
Urbanecm closed this task as Resolved. Sun, Sep 1, 12:09 AM
[urbanecm@mwmaint1002 ~/upload]$ mwscript importImages.php --wiki=commonswiki --user=Yann --overwrite .
Importing Files

शिल्पकार चरित्रकोश खंड २ - साहित्य.pdf exists, overwritinfailed. (at recordUpload stage)

Found: 1
Failed: 1
[urbanecm@mwmaint1002 ~/upload]$

A good question is why it doesn't work (and why the "g" and a space are missing from the output), but that's off-topic here. The issue appears to be known and tracked elsewhere, so I'm closing this as resolved (with a bit of invalid). I created T231737; let's follow up there.