
Investigate high revision text fetching memcached traffic for key modules
Open, Needs Triage, Public

Description

The text of Module:Yesno and Module:Category_handler on enwiki is being fetched from memcached hundreds of times per second (e.g. ~250). Many of these fetches come from the apaches, not just the jobrunners.

Typical trace:

 #0  MemcachedPeclBagOStuff->get(enwiki:revisiontext:textid:664533748) called at [/srv/mediawiki/php-1.26wmf7/includes/Revision.php:1515]
 #1  Revision->loadText() called at [/srv/mediawiki/php-1.26wmf7/includes/Revision.php:1071]
 #2  Revision->getContentInternal() called at [/srv/mediawiki/php-1.26wmf7/includes/Revision.php:1027]
 #3  Revision->getContent() called at [/srv/mediawiki/php-1.26wmf7/includes/parser/Parser.php:3985]
 #4  Parser::statelessFetchTemplate(Module:Yesno, Object of class Parser could not be converted to string) called at [/srv/mediawiki/php-1.26wmf7/includes/parser/Parser.php:3901]
 #5  Parser->fetchTemplateAndTitle(Module:Yesno) called at [/srv/mediawiki/php-1.26wmf7/extensions/Scribunto/common/Base.php:153]
 #6  ScribuntoEngineBase->fetchModuleFromParser(Module:Yesno) called at [/srv/mediawiki/php-1.26wmf7/extensions/Scribunto/engines/LuaCommon/LuaCommon.php:509]

We should figure out:
a) Why this happens (is it low parser output TTLs caused by some bug or magic word?)
b) If it is normal, whether to find ways to reduce the traffic (e.g. APC). We should be careful, though, since APC does not have smart eviction.
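
To make option (b) concrete, here is a minimal sketch of a small per-process cache sitting in front of a shared cache. It is written in Python rather than MediaWiki PHP, and it is not how APC or MediaWiki behaves: the class `LocalTTLCache`, the function `fake_remote_get`, the entry cap, and the TTL are all hypothetical. The cap and TTL stand in for the eviction care mentioned above.

```python
import time
from collections import OrderedDict


class LocalTTLCache:
    """Small in-process cache with LRU eviction and a per-entry TTL (hypothetical)."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get_with_fallback(self, key, fetch_remote):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(key)  # mark as recently used
            return hit[1]
        value = fetch_remote(key)  # miss or expired: ask the shared cache
        self._entries[key] = (now + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value


remote_calls = []


def fake_remote_get(key):
    """Stand-in for a memcached get; records each call it receives."""
    remote_calls.append(key)
    return "text for " + key


cache = LocalTTLCache(max_entries=2, ttl_seconds=10)
key = "enwiki:revisiontext:textid:664533748"
for _ in range(250):
    cache.get_with_fallback(key, fake_remote_get)
print(len(remote_calls))
```

In this sketch, repeated lookups of the same key within the TTL reach the shared cache once. The trade-off is staleness: a new module revision would not be seen locally until the entry expires, so any real TTL would have to be chosen with that in mind.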

Event Timeline

aaron set the priority of this task to Needs Triage.
aaron updated the task description.
aaron subscribed.
aaron set Security to None.
aaron added a subscriber: ori.

Well, (a) could probably be answered by logging the title of the parsed page, i.e. Parser::getTitle().

As for (b), for pages that use those modules, it is normal for there to be one such request per Parser::parse() call. If there is more than one, it could indicate some edge case, like message parsing or an extension cloning the Parser.
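
If the parsed page title were logged alongside each fetch, the two checks above could be done with a simple tally. The sketch below assumes a hypothetical record shape of (parse id, parsed page title, fetched module); the sample records are invented for illustration and are not real log data.

```python
from collections import Counter

# Hypothetical records: (parse_id, parsed_page_title, fetched_module).
# These values are made up for illustration only.
fetch_log = [
    (1, "Example article A", "Module:Yesno"),
    (1, "Example article A", "Module:Category_handler"),
    (2, "MediaWiki:Example-message", "Module:Yesno"),
    (2, "MediaWiki:Example-message", "Module:Yesno"),
    (3, "Example article B", "Module:Yesno"),
]


def fetches_by_title(records):
    """Count module text fetches per parsed page title, for question (a)."""
    return Counter(title for _parse_id, title, _module in records)


def repeated_fetches(records):
    """Return (parse_id, title, module) keys fetched more than once in one parse."""
    per_parse = Counter(
        (parse_id, title, module) for parse_id, title, module in records
    )
    return {key: n for key, n in per_parse.items() if n > 1}


print(fetches_by_title(fetch_log).most_common())
print(repeated_fetches(fetch_log))
```

Any entry returned by `repeated_fetches` would be a candidate for the edge cases mentioned above, since the expectation is one fetch per module per parse.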

Krinkle moved this task from Backlog to Tag on the Performance Issue board.

"If there is more than one, it could indicate some edge case, like message parsing or an extension cloning the Parser."

I note that https://en.wikipedia.org/w/api.php?action=query&titles=Module:Yesno&prop=transcludedin&tinamespace=8&tilimit=max&continue= reports a fair number of MediaWiki-namespace pages using the module.
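
The same query can be scripted to list the pages involved. This is a rough sketch: it adds `format=json`, assumes the default JSON response layout (pages keyed under `query.pages`, each with a `transcludedin` list), and does not follow continuation, so it may return only part of a long result. The function names and the User-Agent string are made up for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, namespace=8):
    """Build the transcludedin query URL for one title and one namespace."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "transcludedin",
        "tinamespace": namespace,
        "tilimit": "max",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def transcluding_titles(response):
    """Extract page titles from a decoded response (assumed default JSON layout)."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        for entry in page.get("transcludedin", []):
            titles.append(entry["title"])
    return titles


def fetch_transcluding_titles(title, namespace=8):
    """Perform a single request; continuation is not followed in this sketch."""
    request = urllib.request.Request(
        build_query_url(title, namespace),
        headers={"User-Agent": "transcludedin-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return transcluding_titles(json.load(reply))
```

Calling `fetch_transcluding_titles("Module:Yesno")` would issue one request for MediaWiki-namespace pages (namespace 8) that transclude the module.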