Page MenuHomePhabricator

VisualEditorDataModule spends significant time in hot startup module
Closed, ResolvedPublic


The ResourceLoader startup module specifies the current versions of all source code modules in production. It does so by iterating the registered modules and calling getVersionHash.

For the VisualEditorDataModule class, this is implemented by producing the actual the script (getScript), and hashing its output. If I recall correctly, I originally introduced that approach as a way of not needing to keeping track of all possible ways in which interface messages can change.

Especially because in VisualEditor in particular, there are parsed interface messages that allow templates. Which means tracking the message itself is usually not sufficient, it also needs to track when any of the templates used within are changed in anyway.

When I first introduced this, the messages were relatively simple, but it seems at least on a few major wikis the messages have reached significant complexity that the overall startup time for ResourceLoader spends the majority of its run-time continuously parsing the same interface messages over and over again to see if it has changed.

It has essentially changed the load.php?modules=startup request from a cheap check we do every 5 minutes, into a continuous uncached wikitext parser endpoint (like action=parse) that a lots (wikis * languages * skins) of concurrent polling loops going on.

Screen Shot 2018-03-08 at 10.27.20.png (1×2 px, 469 KB)

This currently has no parsing because in this particular context the Parser is called from the message interface, which doesn't have ParserOutput cache, due to the need to support message parameters and such.

We need to figure out how to make this more performant. I suggest evaluating one of the following options:

  1. Try to change Message::parse() to instead use WikiPage/ParserOutput in some way, thus naturally levering ParserCache. This would essentially require the message to not use $N parameters, and use things like PAGENAME. This would give it the same performance as when viewing the interface message as a wiki page through MediaWiki regularly, with automatic cascading updates from any templates (via page_touched, and the parser cache) all automatically. Without additional infrastructure. We already don't support PAGENAME for ResourceLoader messages, so that should be fine.
    • Downside: Cannot have $N parameters. (TODO: Is that a problem? Do we use them here?)
    • Downside: Might have slightly different semantics between WikiPage::parse and Message::parse. should be feasible given we already don't support PAGENAME for these messages. The PAGENAME will always be "GlobalTitleFail".
  2. Try wrapping the Message::parse() call in a local cache of some kind (e.g. Memc or APC).
    • Downside: Requires to figure out a way to purge the cache when needed, or requiring figuring out which case key would be safe enough to not need purging. Maybe we can use page_touched. That means we would still call Message::parse and preserve semantics that, but still have a cache that is similar to parser cache, but separate.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Krinkle triaged this task as High priority.Jul 9 2018, 9:14 PM


Debug profile shows it spending 60ms here in one of my samples (tested on commonswiki and enwiki, mostly warmed cache):

Production data shows it being generated about 8 times per minute, and every minute some of those take 120ms or more (p95):
2 year history:

Aggregate shows it as 10% of StartupModule::getScript (229 samples/2181 samples), or 2% of load.php wall time overall: (ctrl-F VisualEditor, or click on "StartupModule::getScript")

This is better than it used to be (used to be about 30% if I recall correctly). Possibly due to change 419956 which removed two of the uncached wikitext-parsing messages.

From a quick look, it seems no message parameters are used here, so option 1 might be viable.

Change 752237 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/core@master] EditPage: Allow getting copyright warning message without parsing it

Change 752240 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/VisualEditor@master] VisualEditorDataModule: Generate definition summary without parsing wikitext

Change 752237 merged by jenkins-bot:

[mediawiki/core@master] EditPage: Allow passing context to getCopyrightWarning()

Change 752240 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] VisualEditorDataModule: Generate definition summary without parsing wikitext