
POST requests to pl.wikibooks.org fail with 'Maximum execution time of 180 seconds exceeded' since 2019-10-24 ~11:30 UTC
Open, Needs Triage, Public

Description

It looks like since ~11:30 UTC today, POST requests to pl.wikibooks.org have been failing due to execution timeouts:

/srv/mediawiki/php-1.35.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/UstringLibrary.php:592

script_filename = /srv/mediawiki/docroot/wikibooks.org/w/api.php
[0x00007f0dc381f190] call() /srv/mediawiki/php-1.35.0-wmf.3/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php:314
[0x00007f0dc381f0e0] callFunction() /srv/mediawiki/php-1.35.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php:292
[0x00007f0dc381f040] executeFunctionChunk() /srv/mediawiki/php-1.35.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php:979
[0x00007f0dc381ef90] invoke() /srv/mediawiki/php-1.35.0-wmf.3/extensions/Scribunto/includes/common/Hooks.php:128
[0x00007f0dc381ed80] invokeHook() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3668
[0x00007f0dc381ec20] callParserFunction() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3373
[0x00007f0dc381e910] braceSubstitution() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/PPFrame_Hash.php:253
[0x00007f0dc381e730] expand() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3290
[0x00007f0dc381e420] braceSubstitution() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/PPFrame_Hash.php:253
[0x00007f0dc381e240] expand() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3549
[0x00007f0dc381df30] braceSubstitution() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/PPFrame_Hash.php:253
[0x00007f0dc381dd50] expand() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3549
[0x00007f0dc381da40] braceSubstitution() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/PPFrame_Hash.php:253
[0x00007f0dc381d860] expand() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:3187
[0x00007f0dc381d790] replaceVariables() /srv/mediawiki/php-1.35.0-wmf.3/includes/parser/Parser.php:876
[0x00007f0dc381d6b0] preprocess() /srv/mediawiki/php-1.35.0-wmf.3/extensions/ParsoidBatchAPI/includes/ApiParsoidBatch.php:222
[0x00007f0dc381d5b0] preprocess() /srv/mediawiki/php-1.35.0-wmf.3/extensions/ParsoidBatchAPI/includes/ApiParsoidBatch.php:118
[0x00007f0dc381d380] execute() /srv/mediawiki/php-1.35.0-wmf.3/includes/api/ApiMain.php:1602
[0x00007f0dc381d300] executeAction() /srv/mediawiki/php-1.35.0-wmf.3/includes/api/ApiMain.php:538
[0x00007f0dc381d240] executeActionWithErrorHandling() /srv/mediawiki/php-1.35.0-wmf.3/includes/api/ApiMain.php:509
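
For reference, a rough way to check from the outside whether a given page still hits this limit is an ordinary parse request against api.php. This is only a sketch: it goes through action=parse rather than the internal ParsoidBatchAPI endpoint in the trace above, and the page title is a placeholder, not one of the affected pages.

```
# Sketch: time a parse of a suspect page on pl.wikibooks.org.
# "Some_page" is a placeholder title, not taken from this task.
import time
import requests

API = "https://pl.wikibooks.org/w/api.php"

start = time.monotonic()
resp = requests.post(
    API,
    data={"action": "parse", "page": "Some_page", "prop": "text", "format": "json"},
    timeout=200,  # a bit above the 180 s server-side execution limit
)
print(resp.status_code, f"{time.monotonic() - start:.1f}s")
```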

Event Timeline

jijiki created this task. · Oct 24 2019, 12:22 PM
Restricted Application added a subscriber: Aklapper. · Oct 24 2019, 12:22 PM
jijiki updated the task description. · Oct 24 2019, 3:55 PM
Krinkle added a subscriber: Krinkle. · Edited · Nov 7 2019, 8:15 PM

This is from an internal Parsoid request.

Unfortunately, timeouts of this kind are quite common, usually triggered by a complex wiki page in the personal sandbox of a user or bot account. We tend to tolerate those failures, and a timeout is pretty much the only option there. It would of course be better if parsing were more deterministic in its resource/time use, so that we could consistently allow or disallow parsing of certain pages and respond with HTTP 4xx, or otherwise refuse to save edits whose result we know we can't parse. Right now these cases produce HTTP 5xx errors, which count towards the "production is unstable / auto-rollback recent deployments / alert SRE" threshold once they reach a certain rate.

(PS: Such prevention is likely close to impossible, because more often than not the expensive parser feature is triggered from a template that is only expensive when used from a certain article. That means the usage likely already existed and was cheap at the time, and only became expensive later when the template was edited, where the edit itself isn't slow. That still seems like an opportunity to fail more gracefully, e.g. by showing a placeholder like "This page is too complicated to parse." in a way that is cacheable and not an HTTP 5xx, but that's a larger issue.)

I suppose for the initial investigation we'll want to understand what causes these plwikibooks pages to time out, and determine whether that's a regression and cause for concern, or the kind of thing we're comfortable not supporting.
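
One way to start that investigation is to pull the parser limit report for the suspect pages and compare Lua time/memory usage across revisions. A minimal sketch using the public action=parse API, with a placeholder page title:

```
# Sketch: dump the parser limit report (including Scribunto/Lua usage) for a page.
# "Some_page" is a placeholder title.
import requests

API = "https://pl.wikibooks.org/w/api.php"

resp = requests.get(
    API,
    params={
        "action": "parse",
        "page": "Some_page",
        "prop": "limitreportdata",
        "format": "json",
    },
    timeout=200,
)
for entry in resp.json()["parse"]["limitreportdata"]:
    # Each entry is a named counter, e.g. CPU time, wall time,
    # expensive function count, or Scribunto time/memory usage.
    print(entry)
```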