Page MenuHomePhabricator

Investigate performance impact of HookContainer loading 500+ interfaces
Open, MediumPublic

Description

With buster, file loading/stat-ing syscalls are slower, so AutoLoader is where a lot of regression is showing up (see detailed explanation in T273312).

We should look into the impact caused by HookRunner loading 500+ interfaces and if there are any straightforward ways to migitate/reduce that (e.g. putting all the interfaces in one/two files)

Notes from IRC:

21:51:59 <legoktm> I started looking into autoloader perf, and it seems like most of the classes we're loading in a simple load.php request are just the hook interfaces: https://phabricator.wikimedia.org/P14211 (74%)
21:55:29 <legoktm> this was on mostly vanilla MW, but still 500+ interfaces does seem like a lot
22:40:31 <ori> https://github.com/wikimedia/mediawiki/blob/master/includes/HookContainer/HookRunner.php#L29-L571 huh
22:50:33 <ori> core hooks are split between an ApiHookRunner for API hooks and a HookRunner for everything else
22:52:23 <ori> is there anything to gain by breaking up HookRunner into new groupings?
22:58:40 <legoktm> ori: maybe, I was thinking of a "SetupHookRunner" for early stuff like the hook in MediaWikiServices, SetupAfterCache, WebRequest::getIP, but stopped looking after seeing it go into LanguageFactory/Language
00:00:00 - {Day changed to Friday, February 5, 2021}
03:17:49 <TimStarling> I think part of the rationale for the whole project was that we don't care about non-opcache installs
03:18:07 <TimStarling> if it's slow with opcache enabled then that is a concern
03:19:11 <TimStarling> there's things we could do to optimise HookRunner but it would be fairly ugly

Event Timeline

Something I thought of is the following:

  1. Introduce a new abstract class HookRunnerBase which implements all of the Hook interfaces. Because it's abstract, do not provide any implementations in that class (leave it empty)
  2. Have HookRunner extend HookRunnerBase (and not directly implement any interfaces)

This extra layer of indirection changes nothing by default; HookRunner still needs to properly implement the interfaces to avoid errors, we get doc comments in IDEs that support them, etc.

HOWEVER a production server can swap out the definition of HookRunnerBase with an empty abstract class that doesn't implement any interfaces (or implements only the subset of hook interfaces that are in widespread use, or whatever). Consider the following snippet of code that could be placed into LocalSettings.php:

abstract class HookRunnerBase { }

Because the class is now defined in LocalSettings.php, the autoloader will not attempt to fetch the regular HookRunnerBase (that has all the interface implementations). The only thing that breaks would be attempting to cast HookRunner into one of the hook interfaces. As far as I can tell, we don't do that so it should be safe.

Of course, asking people to define arbitrary classes in LocalSettings is probably not a good idea, so we can instead introduce some new configuration variable for it, such as $wgDebugHookRunnerInterfaces. This would default to true in DefaultSettings.php. Setting it to false in LocalSettings.php could trigger the following code added to Setup.php (immediately after all extensions are loaded):

if ( !$wgDebugHookRunnerInterfaces ) {
    abstract class HookRunnerBase { }
}

Achieving the same goal, but via a configuration variable. I'm working on a test patch for this so we can benchmark before/after to see if loading these interfaces has any noticeable perf impact.

The above doesn't actually work; while we aren't casting to Hook interfaces anywhere, a handful of constructors take the hook interface as strongly-typed arguments (e.g. WikiPageFactory). This causes fatals when HookRunner doesn't implement the hook interface classes, and I do not believe giving up type safety in that regard is worthwhile.

The above doesn't actually work; while we aren't casting to Hook interfaces anywhere, a handful of constructors take the hook interface as strongly-typed arguments (e.g. WikiPageFactory). This causes fatals when HookRunner doesn't implement the hook interface classes, and I do not believe giving up type safety in that regard is worthwhile.

Based on https://codesearch.wmcloud.org/core/?q=use%20MediaWiki(.*%3F)%3FHook%3B&i=nope&files=&excludeFiles=&repos= it might be that WikiPageFactory is the only place doing this? (PageEditStash actually type hints against HookContainer despite the doc comments indicating otherwise).

I'm not sure what extra safety you get by typing against the Hook interface rather than the Container/Runner besides you could theoretically pass in a standalone implementation of the Hook rather than needing a proper Container. In any case, given that it's only used in one place in core, I think it should be straightforward to comment out, at least for the purpose of this investigation.

Change 662777 had a related patch set uploaded (by Skizzerz; owner: Skizzerz):
[mediawiki/core@master] [DNM] HookRunner perf improvements

https://gerrit.wikimedia.org/r/662777

holger.knust added a project: Performance-Team.
holger.knust moved this task from Inbox to Tech Planning Review on the Platform Engineering board.

Change 663069 had a related patch set uploaded (by Legoktm; owner: Skizzerz):
[mediawiki/core@wmf/1.36.0-wmf.30] [DNM] HookRunner perf improvements

https://gerrit.wikimedia.org/r/663069

I cherry-picked the patch and manually deployed it to mwdebug1003. The request I tested was https://kk.wikipedia.org/w/load.php?debug=false&lang=kk&modules=ext.3d.styles%7Cext.cite.styles%7Cext.uls.interlanguage%7Cext.visualEditor.desktopArticleTarget.noscript%7Cext.wikimediaBadges%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.page.gallery.styles%7Cmediawiki.skinning.interface%7Cmediawiki.toc.styles%7Cskins.vector.styles%7Cwikibase.client.init&only=styles&skin=vector (I moved kkwiki to wmf.30 to avoid having to backport the patch to wmf.27)

My first runs (I accidentally cut off part of the URL in these):

This brings the Autoload::autoload() for load.php down from 13067ms to 4323ms... overall request went from 56,997ms to 39,903ms so some other noise probably but that seems promising.

I re-ran it with the full URL:

functionwith hooksno interfaces
Autoloader::autoload274 calls, 6,144ms274 calls, 4,345ms
Autoloader::autoload@1627 calls, 6,239ms102 calls, 1,148ms
Autoloader::autoload@29 calls, 120ms9 calls, 111ms
Autoloader::autoload@34 calls, 40ms4 calls, 30ms

Next I'm going to depool a buster appserver, and run our more sophisticated/thorough benchmark script against wmf.30 with the hooks, and then again with no interfaces to get better and more accurate numbers.

Mentioned in SAL (#wikimedia-operations) [2021-02-10T08:41:37Z] <legoktm> depooling mw1404.eqiad.wmnet for perf benchmarking (T274041)

It's not really essential for HookRunner to implement any interfaces. I think the simplest solution is to remove all the interfaces from HookRunner. Duplicate the doc comments, and have a test that ensures that the HookRunner method signatures match the interface signatures. @Pchelolo suggests generating HookRunner.php the way we generate autoload.php, which I guess would be nice to have. The interfaces will still exist for the convenience of extension handlers, but they will only need to be loaded when the handlers are actually called.

Part of the rationale for HookRunner implementing interfaces was that PHPStorm was going to show links to the interface declarations. But it usually doesn't -- PHPStorm limits the number of interfaces it will load for a given class, and we're well over that limit. So duplicating the doc comments is the best way to show documentation for developers writing hook callers in PHPStorm.

Next I'm going to depool a buster appserver, and run our more sophisticated/thorough benchmark script against wmf.30 with the hooks, and then again with no interfaces to get better and more accurate numbers.

Data and images available from https://people.wikimedia.org/~legoktm/T274041/ - IMO they're not really worth digging into more because unsurprisingly it's faster when you don't load 500+ more files.

For the most part this showed that this disproportionately affects load.php, probably because ResourceLoader usually doesn't load as many PHP classes as index.php and api.php do. And I suppose those hook interfaces will get loaded eventually when the first hook from an extension is called, but load.php doesn't call anywhere as many hooks.

... The interfaces will still exist for the convenience of extension handlers, but they will only need to be loaded when the handlers are actually called.

I worry that some mega-extensions like CentralAuth or Wikibase could end up loading a lot of interfaces whenever the first hook needs to be called. A convention of grouping related handlers (all API, all OutputPage, etc.) might mitigate that.

Gilles edited projects, added Performance-Team (Radar); removed Performance-Team.
Gilles added a subscriber: Gilles.

Change 663069 abandoned by Legoktm:
[mediawiki/core@wmf/1.36.0-wmf.30] [DNM] HookRunner perf improvements

Reason:

https://gerrit.wikimedia.org/r/663069