Error Message
Fatal error: request has exceeded memory limit in /srv/mediawiki/php-1.31.0-wmf.21/includes/parser/StripState.php on line 137
Stack Trace
N/A
Notes
This is occurring with a high frequency in production.
Subject | Repo | Branch | Lines +/-
---|---|---|---
Limit total expansion size in StripState and improve limit handling | mediawiki/core | master | +223 -144
Started abruptly at 2018-02-20T15:09:19 - https://logstash.wikimedia.org/goto/120f931c22267c8c77415143ec565877
First impression: somebody is adding very large strip markers, which are tripping the memory limits when unstripped to add to the output.
Could be related to 939faea318d9c2107fab3a584bc1c023f3c592e9 which changed the regexp for strip markers. That should have made the match more restrictive, though. But that's precisely the regexp involved in this callback, so it's a pretty strong coincidence.
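For readers unfamiliar with the mechanism being discussed: the parser replaces extension-tag output (e.g. `<nowiki>`) with short opaque "strip markers" and substitutes the stored text back via a regexp callback at the end, which is the callback mentioned above. A minimal Python sketch of the idea (the real implementation is PHP in `includes/parser/StripState.php`; the marker format below is a simplified assumption, not MediaWiki's actual one):

```python
import re

# Assumed, simplified marker format for illustration only.
MARKER_PREFIX = "\x7fUNIQ-"
MARKER_SUFFIX = "-QINU\x7f"

class StripState:
    def __init__(self):
        self.data = {}
        self.counter = 0

    def add(self, text):
        """Replace `text` with a short opaque marker; store the original."""
        self.counter += 1
        marker = f"{MARKER_PREFIX}{self.counter:08x}{MARKER_SUFFIX}"
        self.data[marker] = text
        return marker

    def unstrip(self, s):
        """Substitute markers back recursively. Note the memory cost is
        driven by the size of the *unstripped* text, not the short marker."""
        pattern = re.compile(
            re.escape(MARKER_PREFIX) + r"[0-9a-f]{8}" + re.escape(MARKER_SUFFIX)
        )
        return pattern.sub(lambda m: self.unstrip(self.data[m.group(0)]), s)

ss = StripState()
inner = ss.add("<nowiki>payload</nowiki>")
outer = ss.add(f"before {inner} after")
# The marker is ~20 bytes, but unstripping expands it to the full stored text.
print(ss.unstrip(outer))
```

Since markers can nest (a stored text can itself contain markers), the restored output can be far larger than anything visible before unstripping, which is the failure mode under discussion here.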
I think tying this to the initiating web request would be helpful. As it stands, the logs do not tell us what is causing this. Is it possible to find a more detailed error log that identifies the web request URL?
A stack trace would also be helpful. Or at least the value of $type or $text in the unstripType caller.
Also possible that someone is actually trying to do the sort of XSS attack that bawolff was protecting against? Deliberately creating a very long strip marker?
EDIT (from bawolff): ...or a DOS attempt.
It's also possible that the real issue is somewhere else (during parsing), and StripState handling is just the straw that pushes the memory camel over the edge.
It's not a DOS, there's not enough requests. It's just a broken bot hitting the same old revision via action=parse again and again. Specifically 256170852, a revision of [[Barack Obama]] from 2008. The solution is to find the person running this bot and to tell them to stop doing it.
Looking at the diff of that revision, {{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|.}}}}}}}}}}}}}}}}}}}}} is the only thing out of the ordinary, so probably the trigger.
Turns out I wasn't looking in the right place. fatal.log on mwlog1001 has the traces, and Krinkle provided https://logstash.wikimedia.org/goto/9f522ce47239a4373bc059024f286f5e for identifying the request URL.
Huh... https://en.wikipedia.org/w/api.php?oldid=821745844 as well as https://en.wikipedia.org//w/api.php?action=parse&prop=iwlinks|externallinks&format=json&oldid=821745844 render just fine.
https://logstash.wikimedia.org/goto/8c64c21a6fa00945d06a0f07e3ad138f is the Logstash URL for the full traces, in case others see something there that I am missing.
If it is a bot that is requesting external links on a page, could it be a bot that replaces dead links with Internet Archive links? While there is probably an issue here in the parser (to be determined), the bot seems to be repeatedly retrying the same failing request.
Looks like that is a very similar revision to the one that started this (which Tim had rev-del'ed).
Diff: https://en.wikipedia.org/?diff=255139603&diffonly=1
Quick guide to Rosetta and its graphics - + {{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|{{unblock|.}}}}}}}}}}}}}}}}}}}}} Like all BOINC projects, Rosetta@home runs in the background of the user's computer using idle computer power,
The same vandal made a few more similar edits in 2008, per https://en.wikipedia.org/wiki/Special:Contributions/Endingkey.
Replying to some IRC discussion
<subbu> TimStarling, doing it once, twice, thrice .. and looking at the expandtemplates output shows that the size of the output starts blowing up rapidly.
<subbu> roughly 4^(nesting level-1) * 4k
I see that in my own testing, but the StripState size (var_dumped) isn't increasing quite as fast.
<TimStarling> the post-expand include size is meant to limit massive preprocessor expansions, but I guess strip marker expansion is not included
Yeah, it seems to be counting the length of the strip marker rather than the length of the text inside the marker.
<Krinkle> TimStarling: Hm.. do we use strip markers for template parameters like {{{1}}} etc.? I thought maybe the expansion is big because of the recursion there.
<Krinkle> But afaik we don't use strip markers for those
We don't. But this template does {{#tag:nowiki|{{{1}}}}} three times, and each one of those creates a strip marker. What it probably really wants is a way to get the unparsed wikitext of {{{1}}} to pass to #tag:nowiki, but that's not available so it does the next best thing (and which will generally work for normal uses of the template).
In each case it's only the directly produced bytes that get counted against the post-expand and argument size limits, while the real size explosion is in the three strip markers that each have three strip markers inside, recursively.
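To make the accounting gap concrete, here is a rough Python model of the observation above (the real code is PHP; the base size, marker length, and per-level multiplier are assumptions taken from the numbers quoted in this thread, not measured values):

```python
MARKER_LEN = 40    # rough size of one strip marker in bytes (assumed)
BASE = 4 * 1024    # ~4k bytes for the innermost expansion (assumed)

def model(nesting, copies=3):
    """Model one {{unblock}} level. The post-expand size limit only
    charges the bytes produced directly (a few short markers), while the
    strip state stores `copies` full copies of the previous level's
    unstripped output plus the direct text, so the real output grows by
    roughly (copies + 1) per level."""
    counted = BASE
    actual = BASE
    for _ in range(nesting - 1):
        counted += copies * MARKER_LEN   # what the limit sees
        actual *= copies + 1             # what unstrip actually emits
    return counted, actual

counted, actual = model(11)  # 11 nested {{unblock}} calls, as in the vandal edit
# counted stays around 5 KB, while actual is 4^10 * 4k = 4 GiB,
# which sails past the limit check and hits the per-request memory limit.
print(counted, actual)
```

This matches the 4^(nesting level - 1) * 4k growth reported in the IRC excerpt, and shows why the post-expand include size never trips: the limiter charges marker lengths, not the stored text.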
Fix uploaded at https://gerrit.wikimedia.org/r/#/c/415207/ . Any objections to removing the policy? This is not a DOS attack. It's apparently not even a DOS attack vector, since the memory limit is doing its job, and the CPU time is short.
Change 415207 merged by jenkins-bot:
[mediawiki/core@master] Limit total expansion size in StripState and improve limit handling