Description
The tool has frequently been responding with the error "An error occurred: Server error: POST https://commons.wikimedia.org/w/api.php resulted in a 504 Gateway Timeout response: upstream request timeout". The 'Job queue' shows many failed attempts at uploading files.
Related Objects
- Mentioned In
  - T264228: Unable to upload Laon and Cythna
  - T286701: IA-Upload: retry failed commons uploads (504 errors)
- Mentioned Here
  - T293435: Pywikibot: copy upload calls hang (ish) rather than time out
  - T295009: Improve download speed from archive.org on appservers
  - T292954: [epic] large file uploads to commons
  - T282633: Investigate recent increase in downtime (May 2021) [8 HR]
  - T268594: An error occurred: Server error:
  - T276222: Error: 504 Gateway Timeout
  - T129216: Pywikibot should support async chunked uploading
Event Timeline
I'm also getting these over the API (without IA-Upload) - I think it's that the Commons upload-by-URL call times out.
However, the process is actually still working in the background and the file does eventually appear, but you still get 504 errors even once the file exists.
Based on the symptom and the time this started to become a real issue, I think this could well be the same root cause as T129216: the chunked upload needs to be async, or it times out.
If that is true, porting https://gerrit.wikimedia.org/r/c/pywikibot/core/+/679021 to addwiki would be needed.
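For illustration, here is a minimal Python sketch of what the request parameters for an async chunked upload might look like. It is not the code from the Gerrit change or from addwiki. The parameter names (`stash`, `offset`, `filesize`, `filekey`, `async`) are the ones I understand the MediaWiki `action=upload` API to accept; the helper names and the choice to set `async` on every request are my own simplifications. The chunk bytes themselves would be sent separately as the multipart `chunk` field.

```python
from typing import Dict, List, Optional


def chunk_offsets(filesize: int, chunk_size: int) -> List[int]:
    """Byte offsets at which each chunk starts."""
    return list(range(0, filesize, chunk_size))


def chunk_params(filename: str, filesize: int, offset: int, token: str,
                 filekey: Optional[str] = None,
                 use_async: bool = True) -> Dict[str, str]:
    """Parameters for uploading one chunk into the stash (hypothetical helper)."""
    params = {
        "action": "upload",
        "format": "json",
        "stash": "1",
        "filename": filename,
        "filesize": str(filesize),
        "offset": str(offset),
        "token": token,
    }
    if filekey is not None:
        # Every chunk after the first refers to the key returned for the first.
        params["filekey"] = filekey
    if use_async:
        # Assumption: asks the server to queue slow work instead of blocking.
        params["async"] = "1"
    return params


def publish_params(filename: str, filekey: str, comment: str, token: str,
                   use_async: bool = True) -> Dict[str, str]:
    """Parameters for publishing the assembled file from the stash."""
    params = {
        "action": "upload",
        "format": "json",
        "filename": filename,
        "filekey": filekey,
        "comment": comment,
        "token": token,
    }
    if use_async:
        params["async"] = "1"
    return params
```

With `async` set, the client would presumably have to poll for completion rather than wait on a single long request, which is the part that avoids the gateway timeout.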
For the time being, I've got a cron job running that restarts the webservice every 4 hours. This is not a fix.
It looks like the downtimes began around April 12, and the most recent major commit before that was 1ba22eb9083f53c1118175648941a702e80b2a15, on March 29.
The changes between then and the most recent non-i18n commit: https://github.com/wikisource/ia-upload/compare/1ba22eb9083f53c1118175648941a702e80b2a15..d1020ef5c61a0cb90052acf9b18897d6554986d9
I think there is confusion here (at least I am confused). I thought this was about the IA-Upload tool receiving a 504 from the Commons API and reporting it to the user during the upload process (my pet theory is that this is because addwiki doesn't do async chunked uploads, and sync chunked uploads have recently stopped working well).
The more recent total IA-Upload outage(s) have been the IA-Upload tool itself returning 503s to the user.
Maybe the recent renaming of the task and rewording of the description hasn't helped?
Original title: IA Upload Gateway Timeouts.
Earlier title: IA Upload 504 Gateway Timeouts
Original description: For three days now, the tool has been responding with "An error occurred: Server error: POST https://commons.wikimedia.org/w/api.php resulted in a 504 Gateway Timeout response: upstream request timeout" error. The 'Job queue' shows many failed attempts at uploading files.
Sorry! I must have misread. Inductiveload's comments from April lined up with the general downtime we were seeing, and 503/504 are pretty close numerically, heh... We can re-open T268594: An error occurred: Server error: or T276222: Error: 504 Gateway Timeout about that issue since I incorrectly hijacked this task? Apologies again!
Yeah, sorry – I got confused too!
I've opened T282633 to deal with the current downtime problem.
Is this perhaps simply another symptom of T292954?
@Pigsonthewing Are you still getting these errors? The changes announced in the latest Tech News should have essentially eliminated these if it's the same cause.
@Xover I haven't seen one of these for a while, but they're theoretically still possible until T295009 moves.
However, they're often "bogus" in that you get a timeout, but the file does eventually appear on the server. The first you know of that is often an "exists" error on a subsequent retry (à la T293435).
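A client could guard against these bogus timeouts by checking whether the file has appeared before retrying. The Python sketch below is only an illustration under assumptions: `GatewayTimeout` is a hypothetical stand-in for whatever the HTTP client raises on a 504, `upload` and `exists` are caller-supplied callables, and `page_exists` assumes the response shape of `action=query` with `formatversion=2`, where a missing page carries `"missing": true`.

```python
import time
from typing import Callable


class GatewayTimeout(Exception):
    """Hypothetical stand-in for the error raised on a 504 response."""


def page_exists(query_response: dict) -> bool:
    """True if the first page in an action=query response is not missing."""
    pages = query_response.get("query", {}).get("pages", [])
    return bool(pages) and not pages[0].get("missing", False)


def upload_with_check(upload: Callable[[], None],
                      exists: Callable[[], bool],
                      max_attempts: int = 3,
                      wait_seconds: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try the upload; after a timeout, wait and check before retrying."""
    for _ in range(max_attempts):
        try:
            upload()
            return True
        except GatewayTimeout:
            # The server may still be working in the background, so give it
            # time and then look for the file instead of re-uploading blindly.
            sleep(wait_seconds)
            if exists():
                return True
    return False
```

The wait time and attempt count are arbitrary defaults. A retry that starts while the first upload is still running could still collide with it, so this only reduces the problem.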