Specific fr.wp page is very slow to parse
Closed, Resolved · Public · Security

Description

This fr.wp page:

https://fr.wikipedia.org/?curid=7544505
https://fr.wikipedia.org/wiki/Discussion_Wikipédia:Comité_d'arbitrage/Arbitrage/Touriste-Patrick_Rogel

Takes 60+ seconds to parse, both in Parsoid and in the old parser. It looks like a large-ish but normal discussion page. The problem seems to be in the preprocessor, because when I pasted it into Special:ExpandTemplates, it gave me a "Wikimedia error".

Event Timeline

I reproduced it locally. Here's a version that reproduces the problem with English content language:

[[User:Matma Rex|Matma Rex]] <sup>[[User talk:Matma Rex|talk]]</sup> <br /><small> <span style="color:#333333">20:58, 24 November 2013 (UTC) <br />Mise jour : 12:58, 29 November 2013 (UTC)</span></small>

I also got an exception ID. It's caused by DiscussionTools:

[35da1713-7410-4c0b-adf0-1123048f70cc] /w/api.php?action=parse&format=json&formatversion=2&text=%5B%5Butilisateur%3Aeuphonie%7Ceuphonie%5D%5D%20%3Csup%3E%5B%5Bdiscussion%20utilisateur%3Aeuphonie%7Cbr%C3%A9viaire%5D%5D%3C%2Fsup%3E%20%3Cbr%20%2F%3E%3Csmall%3E%20%3Cspan%20style%3D%22color%3A%23333333%22%3E24%20novembre%202013%20%C3%A0%2020%3A58%20(CET)%20%3Cbr%20%2F%3EMise%20%C3%A0%20jour%20%3A%2029%20novembre%202013%20%C3%A0%2012%3A58%20(CET)%3C%2Fspan%3E%3C%2Fsmall%3E%0A&title=Talk%3AFoo   Wikimedia\RequestTimeout\RequestTimeoutException: The maximum execution time of 60 seconds was exceeded
from /srv/mediawiki/php-1.42.0-wmf.16/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(97)
#0 /srv/mediawiki/php-1.42.0-wmf.16/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(72): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->onTimeout(integer)
#1 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentUtils.php(560): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->Wikimedia\RequestTimeout\Detail\{closure}(integer)
#2 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentParser.php(690): MediaWiki\Extension\DiscussionTools\CommentUtils::linearWalkBackwards(Wikimedia\Parsoid\DOM\Text, Closure)
#3 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentParser.php(973): MediaWiki\Extension\DiscussionTools\CommentParser->findSignature(Wikimedia\Parsoid\DOM\Text, Wikimedia\Parsoid\DOM\Text)
#4 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentUtils.php(540): MediaWiki\Extension\DiscussionTools\CommentParser->MediaWiki\Extension\DiscussionTools\{closure}(string, Wikimedia\Parsoid\DOM\Text)
#5 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentParser.php(983): MediaWiki\Extension\DiscussionTools\CommentUtils::linearWalk(Wikimedia\Parsoid\DOM\Element, Closure)
#6 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentParser.php(98): MediaWiki\Extension\DiscussionTools\CommentParser->buildThreadItems()
#7 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentFormatter.php(285): MediaWiki\Extension\DiscussionTools\CommentParser->parse(Wikimedia\Parsoid\DOM\Element, MediaWiki\Title\TitleValue)
#8 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/CommentFormatter.php(67): MediaWiki\Extension\DiscussionTools\CommentFormatter::addDiscussionToolsInternal(string, MediaWiki\Parser\ParserOutput, MediaWiki\Title\Title)
#9 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/Hooks/ParserHooks.php(69): MediaWiki\Extension\DiscussionTools\CommentFormatter::addDiscussionTools(string, MediaWiki\Parser\ParserOutput, MediaWiki\Title\Title)
#10 /srv/mediawiki/php-1.42.0-wmf.16/extensions/DiscussionTools/includes/Hooks/ParserHooks.php(121): MediaWiki\Extension\DiscussionTools\Hooks\ParserHooks->transformHtml(MediaWiki\Parser\ParserOutput, string, MediaWiki\Title\Title, boolean)
#11 /srv/mediawiki/php-1.42.0-wmf.16/includes/HookContainer/HookContainer.php(159): MediaWiki\Extension\DiscussionTools\Hooks\ParserHooks->onParserAfterTidy(Parser, string)
#12 /srv/mediawiki/php-1.42.0-wmf.16/includes/HookContainer/HookRunner.php(2904): MediaWiki\HookContainer\HookContainer->run(string, array)
#13 /srv/mediawiki/php-1.42.0-wmf.16/includes/parser/Parser.php(1698): MediaWiki\HookContainer\HookRunner->onParserAfterTidy(Parser, string)
#14 /srv/mediawiki/php-1.42.0-wmf.16/includes/parser/Parser.php(658): Parser->internalParseHalfParsed(string, boolean, boolean)
#15 /srv/mediawiki/php-1.42.0-wmf.16/includes/content/WikitextContentHandler.php(397): Parser->parse(string, MediaWiki\Title\Title, ParserOptions, boolean, boolean, NULL)
#16 /srv/mediawiki/php-1.42.0-wmf.16/includes/content/ContentHandler.php(1656): WikitextContentHandler->fillParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams, MediaWiki\Parser\ParserOutput)
#17 /srv/mediawiki/php-1.42.0-wmf.16/includes/content/Renderer/ContentRenderer.php(47): ContentHandler->getParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams)
#18 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiParse.php(158): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput(WikitextContent, MediaWiki\Title\Title, NULL, ParserOptions)
#19 /srv/mediawiki/php-1.42.0-wmf.16/includes/poolcounter/PoolCounterWorkViaCallback.php(73): ApiParse->{closure}()
#20 /srv/mediawiki/php-1.42.0-wmf.16/includes/poolcounter/PoolCounterWork.php(172): MediaWiki\PoolCounter\PoolCounterWorkViaCallback->doWork()
#21 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiParse.php(165): MediaWiki\PoolCounter\PoolCounterWork->execute()
#22 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiParse.php(441): ApiParse->getContentParserOutput(WikitextContent, MediaWiki\Title\Title, NULL, ParserOptions)
#23 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiMain.php(1942): ApiParse->execute()
#24 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiMain.php(917): ApiMain->executeAction()
#25 /srv/mediawiki/php-1.42.0-wmf.16/includes/api/ApiMain.php(888): ApiMain->executeActionWithErrorHandling()
#26 /srv/mediawiki/php-1.42.0-wmf.16/api.php(95): ApiMain->execute()
#27 /srv/mediawiki/php-1.42.0-wmf.16/api.php(48): wfApiMain()
#28 /srv/mediawiki/w/api.php(3): require(string)
#29 {main}
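
To see exactly which wikitext triggered the timeout, the URL-encoded `text` parameter on the request line of the trace can be decoded. This is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of MediaWiki.

```python
from urllib.parse import parse_qs, urlsplit

# Request line copied from the exception above, split across lines for readability.
REQUEST = (
    "/w/api.php?action=parse&format=json&formatversion=2"
    "&text=%5B%5Butilisateur%3Aeuphonie%7Ceuphonie%5D%5D%20"
    "%3Csup%3E%5B%5Bdiscussion%20utilisateur%3Aeuphonie%7Cbr%C3%A9viaire%5D%5D%3C%2Fsup%3E%20"
    "%3Cbr%20%2F%3E%3Csmall%3E%20"
    "%3Cspan%20style%3D%22color%3A%23333333%22%3E"
    "24%20novembre%202013%20%C3%A0%2020%3A58%20(CET)%20"
    "%3Cbr%20%2F%3EMise%20%C3%A0%20jour%20%3A%2029%20novembre%202013%20%C3%A0%2012%3A58%20(CET)"
    "%3C%2Fspan%3E%3C%2Fsmall%3E%0A"
    "&title=Talk%3AFoo"
)

# parse_qs returns a list of values per key; each key occurs once here.
params = {key: values[0] for key, values in parse_qs(urlsplit(REQUEST).query).items()}
wikitext = params["text"]

if __name__ == "__main__":
    print(params["title"])
    print(wikitext)
```

The decoded text is a single signature with two timestamps inside the same `<small><span>…</span></small>` wrapper, separated by `<br />`.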

I found why it happens. In certain cases the parser could go backward rather than forward after finding a signature, so it kept finding the same signature forever until it ran out of memory (or, as in the trace above, hit the 60-second execution limit). The necessary condition is two valid timestamps in the same formatting element, separated by a block element, like this:

hi [[User:Matma Rex|Matma Rex]] <small>01:01, 7 February 2024 (UTC)<br />01:02, 7 February 2024 (UTC)</small>
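
The failure mode can be shown with a toy model in Python. This is not the DiscussionTools code. It works on a flat string rather than a DOM, the regular expressions are simplified assumptions, and the "buggy" cursor reset rule is made up purely to show how a scan that resumes before the timestamp it just handled never terminates. The step limit stands in for the request timeout.

```python
import re

# Simplified, hypothetical patterns for the English sample above.
TIMESTAMP = re.compile(r"\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)")
USER_LINK = re.compile(r"\[\[User:[^\]|]+\|[^\]]+\]\]")

SAMPLE = (
    "hi [[User:Matma Rex|Matma Rex]] "
    "<small>01:01, 7 February 2024 (UTC)<br />01:02, 7 February 2024 (UTC)</small>"
)


def find_signatures(text, buggy=False, max_steps=50):
    """Scan forward for timestamps and return (signature_start, timestamp_end) spans.

    Raises RuntimeError if the scan has not finished after max_steps iterations.
    """
    found = []
    cursor = 0
    for _ in range(max_steps):
        ts = TIMESTAMP.search(text, cursor)
        if ts is None:
            return found
        # "Walk backwards" from the timestamp to the nearest preceding user link.
        links = list(USER_LINK.finditer(text, 0, ts.start()))
        sig_start = links[-1].start() if links else ts.start()
        found.append((sig_start, ts.end()))
        if buggy:
            # Made-up faulty rule: resume after the FIRST timestamp in the
            # signature range, which may lie before the one just handled.
            cursor = TIMESTAMP.search(text, sig_start).end()
        else:
            # Always resume after the timestamp just handled.
            cursor = ts.end()
    raise RuntimeError("scan did not terminate")


if __name__ == "__main__":
    print(len(find_signatures(SAMPLE)))
    try:
        find_signatures(SAMPLE, buggy=True)
    except RuntimeError as exc:
        print("buggy variant:", exc)
```

In this model a signature with a single timestamp is handled correctly by both variants. Only the second timestamp sends the faulty cursor back to a position it has already passed, which is why the sample needs two timestamps to trigger the loop.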

It's a one-line fix:

I would like to backport this, and then I'll submit (as a public patch) a test case and some cleanup.

sbassett added subscribers: gerritbot, sbassett.

CR+1, also fine with this being low-risk to just go through gerrit.

Change 998553 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998553

Change 998553 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998553

Change 998453 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998453

Change 998454 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.17] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998454

Change 998454 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.17] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998454

Change 998453 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] Parser: Fix the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998453

Fixed in production. @sbassett Could you please make this task public? Thanks! (and thanks for fixing my gerritbot mistake, I added it in the wrong field)

Change 998993 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Add test cases for the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998993

sbassett triaged this task as Medium priority.
sbassett changed Author Affiliation from N/A to WMF Product.
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett changed Risk Rating from N/A to Medium.

Change 998993 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Add test cases for the main loop getting stuck on some signatures

https://gerrit.wikimedia.org/r/998993