The ResourceLoader startup module specifies the current versions of all source code modules in production. It does so by iterating over the registered modules and calling getVersionHash on each.
For the VisualEditorDataModule class, this is implemented by producing the actual script (getScript) and hashing its output. If I recall correctly, I originally introduced that approach as a way of not needing to keep track of all possible ways in which interface messages can change.
This matters especially for VisualEditor, which has parsed interface messages that allow templates. That means tracking the message itself is usually not sufficient; it also needs to track when any of the templates used within it are changed in any way.
When I first introduced this, the messages were relatively simple. It seems that on at least a few major wikis the messages have since become complex enough that the ResourceLoader startup module spends the majority of its run time parsing the same interface messages over and over again, just to see whether they have changed.
This has essentially changed the load.php?modules=startup request from a cheap check we do every 5 minutes into a continuously hit, uncached wikitext parser endpoint (like action=parse), with lots (wikis * languages * skins) of concurrent polling loops going on.
This currently has no parser caching because, in this particular context, the Parser is called from the message interface, which doesn't have a ParserOutput cache, due to the need to support message parameters and such.
We need to figure out how to make this more performant. I suggest evaluating one of the following options:
- Try to change Message::parse() to instead use WikiPage/ParserOutput in some way, thus naturally leveraging ParserCache. This would essentially require the message to not use $N parameters or things like PAGENAME. It would give the message the same performance as when viewing it as a wiki page through MediaWiki regularly, with automatic cascading updates from any templates (via page_touched and the parser cache), without additional infrastructure. We already don't support PAGENAME for ResourceLoader messages, so that should be fine.
  - Downside: Cannot have $N parameters. (TODO: Is that a problem? Do we use them here?)
  - Downside: Might have slightly different semantics between WikiPage::parse and Message::parse. This should be feasible given we already don't support PAGENAME for these messages; the PAGENAME will always be "GlobalTitleFail".
- Try wrapping the Message::parse() call in a local cache of some kind (e.g. Memc or APC).
  - Downside: Requires figuring out a way to purge the cache when needed, or figuring out which cache key would be safe enough to not need purging. Maybe we can use page_touched. That means we would still call Message::parse and preserve its semantics, but have a cache that is similar to the parser cache, yet separate.
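
The second option could look roughly like the following minimal Python sketch (not MediaWiki code). `TouchedKeyedCache` and its parameters are hypothetical names, and an in-process dict stands in for Memc or APC. It assumes the `touched` value changes whenever the message page or any template it uses changes, as page_touched is suggested to do above; if that assumption does not hold, the cache would serve stale output.

```python
class TouchedKeyedCache:
    """Cache parsed message output under a key that includes a 'touched' timestamp.

    Because the timestamp is part of the key, a change to the message (or,
    by assumption, to one of its templates) yields a new key, so no explicit
    purge is needed.
    """

    def __init__(self, parse):
        # parse: callable taking wikitext and returning parsed output
        # (hypothetical stand-in for Message::parse()).
        self._parse = parse
        # Plain dict standing in for Memc/APC. Entries under old keys are
        # never evicted here; a real cache would rely on a TTL or LRU.
        self._store = {}

    def get(self, msg_key, lang, touched, wikitext):
        key = (msg_key, lang, touched)
        if key not in self._store:
            # Cache miss: parse once and remember the result under this key.
            self._store[key] = self._parse(wikitext)
        return self._store[key]
```

With this shape, repeated startup requests for an unchanged message would hit the cache, and the parser would only run again after the touched value moves.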