Page MenuHomePhabricator

Preprocessor_Hash: Allowed memory size exhausted
Open, NormalPublic

Description

Error

Request ID: XUs9@QpAICoAAJNbKO4AAACP
Request URL: https://de.wikipedia.org/wiki/Bild_(Zeitung)

message
[XUs9@QpAICoAAJNbKO4AAACP] /wiki/Bild_(Zeitung)   PHP Fatal Error from line 126 of /srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Preprocessor_Hash.php: Allowed memory size of 692060160 bytes exhausted (tried to allocate 20480 bytes)
stack
#1  json_decode() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Preprocessor_Hash.php:126]
#2  Preprocessor_Hash->preprocessToObj() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3063]
#3  Parser->preprocessToDom() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3673]
#4  Parser->getTemplateDom() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3426]
#5  Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#6  PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3229]
#7  Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#8  PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPTemplateFrame_Hash.php:167]
#9  PPTemplateFrame_Hash->getNamedArgument() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPTemplateFrame_Hash.php:179]
#10 PPTemplateFrame_Hash->getArgument() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPTemplateFrame_Hash.php:111]
#11 PPTemplateFrame_Hash->getArguments() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php:693]
#12 Scribunto_LuaEngine->getAllExpandedArguments() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php:391]
#13 Scribunto_LuaSandboxCallback->__call()
#14 LuaSandboxFunction->call() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php:314]
#15 Scribunto_LuaSandboxInterpreter->callFunction() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php:296]
#16 Scribunto_LuaEngine->executeFunctionChunk() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php:983]
#17 Scribunto_LuaModule->invoke() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/Scribunto/includes/common/Hooks.php:128]
#18 ScribuntoHooks::invokeHook() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3607]
#19 Parser->callParserFunction() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3312]
#20 Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#21 PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/CoreParserFunctions.php:1080]
#22 CoreParserFunctions::tagObj() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3607]
#23 Parser->callParserFunction() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3312]
#24 Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#25 PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3488]
#26 Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#27 PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3488]
#28 Parser->braceSubstitution() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/PPFrame_Hash.php:254]
#29 PPFrame_Hash->expand() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:3126]
#30 Parser->replaceVariables() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:1437]
#31 Parser->internalParse() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/parser/Parser.php:568]
#32 Parser->parse() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/content/WikitextContent.php:365]
#33 WikitextContent->fillParserOutput() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/content/AbstractContent.php:555]
#34 AbstractContent->getParserOutput() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/backend/FlaggedRevs.php:616]
#35 FlaggedRevs::parseStableRevision() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/backend/FlaggedRevs.php:534]
#36 FlaggedRevs::{closure}() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/poolcounter/PoolCounterWorkViaCallback.php:69]
#37 PoolCounterWorkViaCallback->doWork() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/poolcounter/PoolCounterWork.php:125]
#38 PoolCounterWork->execute() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/backend/FlaggedRevs.php:550]
#39 FlaggedRevs::parseStableRevisionPooled() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/frontend/FlaggablePageView.php:778]
#40 FlaggablePageView->showStableVersion() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/frontend/FlaggablePageView.php:401]
#41 FlaggablePageView->setPageContent() called at [/srv/mediawiki/php-1.34.0-wmf.16/extensions/FlaggedRevs/frontend/FlaggedRevsUIHooks.php:220]
#42 FlaggedRevsUIHooks::onArticleViewHeader() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/Hooks.php:174]
#43 Hooks::callHook() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/Hooks.php:202]
#44 Hooks::run() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/page/Article.php:678]
#45 Article->view() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/actions/ViewAction.php:63]
#46 ViewAction->show() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php:507]
#47 MediaWiki->performAction() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php:302]
#48 MediaWiki->performRequest() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php:892]
#49 MediaWiki->main() called at [/srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php:523]
#50 MediaWiki->run() called at [/srv/mediawiki/php-1.34.0-wmf.16/index.php:42]
#51 require(/srv/mediawiki/php-1.34.0-wmf.16/index.php) called at [/srv/mediawiki/w/index.php:3]
Notes

92 errors since 2019-08-04. Appears to be happening in a variety of contexts, mostly on dewiki but a few of the early ones on enwiki.

Event Timeline

brennen created this task.Aug 7 2019, 10:00 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 7 2019, 10:00 PM
mobrovac triaged this task as Normal priority.Aug 8 2019, 9:28 AM
mobrovac edited subscribers, added: mobrovac; removed: GWicke, Aklapper.

This seems to be a wmf.16 regression (as this error is not appearing for earlier deploys).

Assuming this task is about MediaWiki-Parser, hence adding project tag so others can find this task when searching for tasks under that project or looking at that project workboard.

See T216664: MWException when viewing or comparing certain pages with Preprocessor_DOM (PHP7 beta feature) for a past issue.

mobrovac updated the task description. (Show Details)Aug 13 2019, 10:36 AM

Change 529971 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/core@master] Parser hash preprocessor: Do not fail on json_decode() errors

https://gerrit.wikimedia.org/r/529971

I have looked into this, but couldn't find an exact root cause. To me this smells like a cache data corruption issue. Here's some data:

  • The problem occurs in json_decode() while decoding the cached stringified JSON.
  • The error appears sporadically until 2019-08-04, but intensifies afterwards.
  • Virtually all of the occurrences of this error are for dewiki pages about newspapers.
  • All of the pages include the Infobox Publikation template, which uses Lua.
  • All of the errors haven been produced on PHP 7 nodes.

Furthermore, I couldn't find any occurrence of memory exhaustion during the matching json_encode() part, which leads me to believe the data that is fed into json_decode() is erroneous. As I don't see any sensible way of reproducing this, I propose a patch above that logs some relevant info when the problem happens in the hopes that it could lead us to a diagnosis.

Pinging @aaron, @cscott and @Anomie as they might be able to help here.

Joe added a subscriber: Joe.Aug 14 2019, 9:56 AM

T214984 seems somewhat related.

jijiki added a subscriber: jijiki.Aug 14 2019, 10:05 AM

T214984 seems somewhat related.

Indeed it might be related. Since all of the manifestations of this bug occur on PHP7 and it happens sporadically (given that all affected pages display and the rate is very low), a plausible explanation here could be that the JSON string is encoded by HHVM, but then when PHP7 tries to decode it, it spirals into an infinite loop.

Change 529971 abandoned by Mobrovac:
Parser hash preprocessor: Do not fail on json_decode() errors

Reason:
As Anomie points out, memory exhaustion in an unrecoverable error in PHP7 as well, so this patch is pointless.

https://gerrit.wikimedia.org/r/529971

T214984 seems somewhat related.

Indeed it might be related. Since all of the manifestations of this bug occur on PHP7 and it happens sporadically (given that all affected pages display and the rate is very low), a plausible explanation here could be that the JSON string is encoded by HHVM, but then when PHP7 tries to decode it, it spirals into an infinite loop.

I can't say it's impossible, but it seems unlikely to me. A T214984 issue would more likely just cause the json_decode() to return null, not to use up all memory somehow. If there is a JSON string that makes PHP7 infinitely loop, we'd want to report that upstream as a security bug (DOS vector).

More likely, IMO, is that on HHVM it comes in just under the memory limit while in PHP7 it somehow uses a bit more memory and so runs out during the decode. It would be helpful if we could find a reliable reproduction case.

I can't say it's impossible, but it seems unlikely to me. A T214984 issue would more likely just cause the json_decode() to return null, not to use up all memory somehow. If there is a JSON string that makes PHP7 infinitely loop, we'd want to report that upstream as a security bug (DOS vector).
More likely, IMO, is that on HHVM it comes in just under the memory limit while in PHP7 it somehow uses a bit more memory and so runs out during the decode. It would be helpful if we could find a reliable reproduction case.

Hm, the text cut-off is at 1M, but the failure reports over 690M being already used. I'm finding it hard to believe that the memory footprint of the structure is over 690 times larger than the text it derives from.

Anomie added a comment.EditedAug 14 2019, 3:50 PM

Not all of the 690M used is due to the json_decode(), though. That 690M includes all the rest of MediaWiki and its data too, plus the string being decoded.

But also lots of little PHP objects/arrays can be surprisingly memory-hungry. See T55663 for a similar issue.

anomie@mwmaint1002:~$ php -r '$a = []; $n = 100000; for($i=0; $i<$n; $i++){ $a[] = "{\"foo\":$i}"; } $s = "[" . implode( ",", $a ) . "]"; unset( $a ); $m0 = memory_get_usage(); $b = json_decode( $s ); $m1 = memory_get_usage(); printf( "\nJSON string length:         %10d\nMemory usage before decode: %10d\nMemory usage after decode:  %10d\nMemory usage delta:         %10d\n\nExpansion factor: %g\nMemory used per object: ~%g\n", strlen( $s ), $m0, $m1, $m1 - $m0, ( $m1 - $m0 ) / strlen( $s ), ( $m1 - $m0 ) / $n );'

JSON string length:            1388891
Memory usage before decode:    1780304
Memory usage after decode:    51819168
Memory usage delta:           50038864

Expansion factor: 36.0279
Memory used per object: ~500.389

anomie@mwmaint1002:~$ php -r '$a = []; $n = 100000; for($i=0; $i<$n; $i++){ $a[] = "{\"foo\":$i}"; } $s = "[" . implode( ",", $a ) . "]"; unset( $a ); $m0 = memory_get_usage(); $b = json_decode( $s, true ); $m1 = memory_get_usage(); printf( "\nJSON string length:         %10d\nMemory usage before decode: %10d\nMemory usage after decode:  %10d\nMemory usage delta:         %10d\n\nExpansion factor: %g\nMemory used per array: ~%g\n", strlen( $s ), $m0, $m1, $m1 - $m0, ( $m1 - $m0 ) / strlen( $s ), ( $m1 - $m0 ) / $n );'

JSON string length:            1388891
Memory usage before decode:    1780688
Memory usage after decode:    46779168
Memory usage delta:           44998480

Expansion factor: 32.3989
Memory used per array: ~449.985

And for comparison, a similar array full of scalars:

anomie@mwmaint1002:~$ php -r '$a = []; $n = 100000; for($i=0; $i<$n; $i++){ $a[] = " \"foo\",$i "; } $s = "[" . implode( ",", $a ) . "]"; unset( $a ); $m0 = memory_get_usage(); $b = json_decode( $s, true ); $m1 = memory_get_usage(); printf( "\nJSON string length:         %10d\nMemory usage before decode: %10d\nMemory usage after decode:  %10d\nMemory usage delta:         %10d\n\nExpansion factor: %g\nMemory used per pair: ~%g\n", strlen( $s ), $m0, $m1, $m1 - $m0, ( $m1 - $m0 ) / strlen( $s ), ( $m1 - $m0 ) / $n );'

JSON string length:            1388891
Memory usage before decode:    1780688
Memory usage after decode:    13373472
Memory usage delta:           11592784

Expansion factor: 8.34679
Memory used per pair: ~115.928
mmodell changed the subtype of this task from "Task" to "Production Error".Wed, Aug 28, 11:06 PM