
Read timeout after marking pages for translation
Closed, Resolved (Public)

Description

I recently received several "Wikimedia Foundation" errors after carrying out translation admin actions. The action completed successfully in each case, but the page that is supposed to be loaded afterwards threw a read timeout. Listing the log entry and the corresponding error message for three cases:

20:52, 30 August 2013 Tbayer (WMF) (talk | contribs | block) marked Wikimedia Blog/Drafts/Education Program Arab World third term wrap-up for translation

https://meta.wikimedia.org/wiki/Special:PageTranslation :
Request: POST http://meta.wikimedia.org/wiki/Special:PageTranslation, from 10.64.0.127 via cp1016.eqiad.wmnet (squid/2.7.STABLE9) to 10.2.2.1 (10.2.2.1)
Error: ERR_READ_TIMEOUT, errno [No Error] at Fri, 30 Aug 2013 20:53:33 GMT

21:00, 30 August 2013 Tbayer (WMF) (talk | contribs | block) marked Wikimedia Blog/Drafts/Full Speed Ahead in the Arab World for translation

https://meta.wikimedia.org/wiki/Special:PageTranslation :
Request: POST http://meta.wikimedia.org/wiki/Special:PageTranslation, from 10.64.0.131 via cp1016.eqiad.wmnet (squid/2.7.STABLE9) to 10.2.2.1 (10.2.2.1)
Error: ERR_READ_TIMEOUT, errno [No Error] at Fri, 30 Aug 2013 21:01:29 GMT

00:29, 31 August 2013 Tbayer (WMF) (talk | contribs | block) sent a notification about translating page Wikimedia Highlights, July 2013; languages: all languages; deadline: none; priority: medium; sent to 1597 recipients, failed for 0 recipients, skipped for 19 recipients

https://meta.wikimedia.org/wiki/Special:NotifyTranslators :
Request: POST http://meta.wikimedia.org/wiki/Special:NotifyTranslators, from 208.80.154.77 via cp1013.eqiad.wmnet (squid/2.7.STABLE9) to 10.64.0.142 (10.64.0.142)
Error: ERR_READ_TIMEOUT, errno [No Error] at Sat, 31 Aug 2013 00:29:38 GMT

It's similar to https://bugzilla.wikimedia.org/show_bug.cgi?id=41131 ("Timeout when sending translation notification")
However, that bug was closed as fixed half a year ago and indeed has not occurred for me since at least the beginning of this year; it also never happened when marking pages for translation. So it seems reasonable to assume that the cause is different now, hence filing it separately.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=53932
https://bugzilla.wikimedia.org/show_bug.cgi?id=55397
https://bugzilla.wikimedia.org/show_bug.cgi?id=41131

Details

Reference
bz53769

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:51 AM
bzimport set Reference to bz53769.
bzimport added a subscriber: Unknown Object (MLST).

Yeah, this seems similar to the CN bug I filed earlier (bug: 53674 ) where a save there times out constantly (or takes a long time and almost times out). Fundraising thinks it's because of a translation job running on save as well. The Mark Translation timeouts were a nightmare while setting up the privacy policy discussion yesterday (and as we tweak it during the discussion now), and it has basically gotten to the point that almost every markup times out (I think only one of a good 30+ markups in the past 24 hours did not).

(In reply to comment #1)

Yeah, this seems similar to the CN bug I filed earlier (bug: 53674 ) where a
save there times out constantly (or takes a long time and almost times out).
Fundraising thinks it's because of a translation job running on save as well.

Bug 53674 has a patch now; it is probably worth rechecking once it's merged.

Seeking a bit of an update on this if possible. I believe something was said in a separate email thread about possible ops needs for this (new translation servers or something like that), but as I was not on that thread I'm not sure exactly what the situation is.

Patch in bug 53674 got merged on Friday and will likely be deployed today (Monday afternoon PST time) - retesting highly welcome after that.

(In reply to comment #4)

Patch in bug 53674 got merged on Friday and will likely be deployed today
(Monday afternoon PST time) - retesting highly welcome after that.

Aye, sounds like this got pushed (it's deployed but not enabled). That particular patch should not affect this bug, however: it works by just making the CN save asynchronous with respect to the translation task, without fixing the speed of the translation task itself.
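
For readers unfamiliar with the approach described in the comment above, here is a minimal sketch of deferring slow work so that a save request returns before it finishes. It is written in Python purely for illustration; it is not the CentralNotice patch, and every name in it (JobQueue, save_banner, update_translation_metadata) is hypothetical. As the comment notes, the slow task still takes just as long; it simply no longer blocks the request.

```python
import queue
import threading
import time


def update_translation_metadata(banner_name, done_log):
    """Hypothetical stand-in for the slow translation task."""
    time.sleep(0.05)  # simulate slow work
    done_log.append(banner_name)


class JobQueue:
    """Tiny in-process job queue with one background worker thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            func, args = self._jobs.get()
            try:
                func(*args)
            except Exception as exc:  # keep the worker alive on failure
                print(f"job failed: {exc!r}")
            finally:
                self._jobs.task_done()

    def push(self, func, *args):
        self._jobs.put((func, args))

    def wait(self):
        """Block until every queued job has finished."""
        self._jobs.join()


def save_banner(banner_name, jobs, done_log):
    """Return right away; the slow task runs later in the worker."""
    jobs.push(update_translation_metadata, banner_name, done_log)
    return "saved"


if __name__ == "__main__":
    jobs = JobQueue()
    log = []
    print(save_banner("ExampleBanner", jobs, log))  # does not wait for the task
    jobs.wait()
    print(log)
```

The trade-off is that the request can no longer report a failure of the deferred task directly, so errors have to be surfaced some other way (here they are only printed).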

Change 83827 had a related patch set uploaded by Nikerabbit:
Add getKeys optimization to BannerMessageGroup

https://gerrit.wikimedia.org/r/83827

Change 83818 had a related patch set uploaded by Nikerabbit:
Always call getKeys for message groups if it exists

https://gerrit.wikimedia.org/r/83818

Change 83818 merged by Mwalker:
Always call getKeys for message groups if it exists

https://gerrit.wikimedia.org/r/83818

Change 83827 merged by Mwalker:
Add getKeys optimization to BannerMessageGroup

https://gerrit.wikimedia.org/r/83827

Change 83980 had a related patch set uploaded by Mwalker:
Clean up CentralNotice Translation Metadata

https://gerrit.wikimedia.org/r/83980

Thank you for filing this bug, Tilman. I've been hitting the same issue on Meta-Wiki.

I tried Special:NotifyTranslators and it was very quick, nowhere near timing out. If it does time out for you, can you please list the parameters you used in a separate bug, as that probably has different cause than when marking page for translation.

Change 83980 merged by Adamw:
Clean up CentralNotice Translation Metadata

https://gerrit.wikimedia.org/r/83980

Change 84676 had a related patch set uploaded by Mwalker:
Cache Banner Message Field Definitions

https://gerrit.wikimedia.org/r/84676

Change 84676 merged by Adamw:
Cache Banner Message Field Definitions

https://gerrit.wikimedia.org/r/84676

(In reply to comment #12)

I tried Special:NotifyTranslators and it was very quick, nowhere near timing
out. If it does time out for you, can you please list the parameters you used
in a separate bug, as that probably has different cause than when marking
page for translation.

OK, filed as https://bugzilla.wikimedia.org/show_bug.cgi?id=55397 after this happened again.