Page MenuHomePhabricator

3% of load.php spent creating DBLoadBalancerFactory in WebStart.php
Closed, ResolvedPublic

Description

Unconditionally, on all web requests (including api.php, rest.php, load.php, etc.) we create the DBLoadBalancerFactory service in order to inject some request-specific meta data:

includes/Setup.php
MediaWikiServices::getInstance()->getDBLoadBalancerFactory()->setRequestInfo( [
	'IPAddress' => $wgRequest->getIP(),
	
] );

To construct the DBLoadBalancerFactory service, it spend 2% of the time doing:

  • creates LBFactory
  • creates WANObjectCache
  • creates getStatsdDataFactory
  • creates and initialises MemcachedPeclBagOStuff

Screenshot 2019-07-24 at 16.44.47.png (794×2 px, 265 KB)

And to compute WebRequest::getIP it spends another 1%:

capture.png (607×2 px, 114 KB)

Let's look at what these constructors are doing and defer what we can so that the instantiation is a cheap as it can be.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Krinkle triaged this task as High priority.Jul 25 2019, 9:18 PM

As most things from T189966, this would be nice to resolve before the new rest.php entrypoint goes live.

Change 532367 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/core@master] rdbms: Defer instantiating the DBLoadBalancerFactory service

https://gerrit.wikimedia.org/r/532367

I've just uploaded a patch to defer this explicit instantiation. However, the DBLoadBalancerFactory will still be implicitly instantiated on every request if MediaWiki is configured to use a shared database, due to the following:

includes/Setup.php
if ( $wgSharedDB && $wgSharedTables ) {
	// Apply $wgSharedDB table aliases for the local LB (all non-foreign DB connections)
	MediaWikiServices::getInstance()->getDBLoadBalancer()->setTableAliases(
		array_fill_keys(
			$wgSharedTables,
			[
				'dbname' => $wgSharedDB,
				'schema' => $wgSharedSchema,
				'prefix' => $wgSharedPrefix
			]
		)
	);
}

I feel this should be handled separately.

Change 532367 abandoned by TK-999:
rdbms: Defer instantiating the DBLoadBalancerFactory service

Reason:
The outcome of T218555 is that WebRequest should not be made into a service or used as a service dependency, a decision that this patch is not compatible with.

https://gerrit.wikimedia.org/r/532367

I've looked at this again, and I think avoiding construction might not be useful here.

While it is certainly true that some requests do not need to connect to the database (especially on load.php etc.), I do think virtually all end points will need to be able to connect to it if they want to. This means they would still create the LBFactory service and pass it down as needed.

I already see this happening for ResourceLoader where, as we're adopting more DI over the past few months, it now too has LBFactory injected via service wiring. This means that even if we avoid it in Setup.php, it would still happen on the second line in load.php just the same.

So instead of avoiding instantiation, I'll see what we can do to defer or optimise more of the work it is doing in all these constructors.

Change 580086 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Remove $wgProfileLimit setting

https://gerrit.wikimedia.org/r/580086

Change 580086 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Remove $wgProfileLimit setting

https://gerrit.wikimedia.org/r/580086

This performs a very minor optimizations in Profiler::instance() which is called as part of MWLBFactory::applyDefaultConfig via MediaWikiServices::getDBLoadBalancerFactory.

Change 580086 merged by jenkins-bot:
[mediawiki/core@master] Remove $wgProfileLimit setting

https://gerrit.wikimedia.org/r/580086

It seems the vast majority of this time is spent setting up the whole MonologSpi chain and the dozens of SyslogHandler instances and wrappers and other handler objects (see wmf-config/logging.php).

I wonder if some of that could be simplifed or made more lazy rather than upfront.

Screenshot 2020-03-30 at 20.27.40.png (1×2 px, 242 KB)

https://performance.wikimedia.org/arclamp/svgs/daily/2020-03-29.excimer.load.svgz

This task is about early initialisation of LBFactory before/without it being needed. A similar issue was encountered with Maintenance scripts in order to support Maintenance::getDbType() === DB_NONE (ref T238436, T237148). This is currently solved by having the call to Maintenance::setAgentAndTriggers be conditional during process setup.

As part of this task I'd like to explore moving both of these process-specific data feeders for LBFactory, to ServiceWiring (likely via MWLBFactory). That way they naturally only happen when the service is called upon. And thus becomes compatible with MediaWikiServices::disableStorageBackend(), which currently requires ad-hoc conditionals to avoid triggering these bits of setup calls.

Change 677729 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] rdbms: Remove unused 'hostname' in LBFactory and LoadBalancer

https://gerrit.wikimedia.org/r/677729

Change 677730 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] rdbms: Move setLBFactoryTriggers from Maintenance to MWLBFactory

https://gerrit.wikimedia.org/r/677730

Change 677729 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Remove unused 'hostname' in LBFactory and LoadBalancer

https://gerrit.wikimedia.org/r/677729

Change 679984 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] Setup: Move $wgSharedDB init to ServiceWiring

https://gerrit.wikimedia.org/r/679984

Change 679991 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] Setup: Move LBFactory::setRequestInfo() to service wiring

https://gerrit.wikimedia.org/r/679991

Change 682772 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] rdbms: Use REQUEST_TIME_FLOAT for LBFactory::getCPInfoFromCookieValue

https://gerrit.wikimedia.org/r/682772

Change 679991 merged by jenkins-bot:

[mediawiki/core@master] Setup: Move LBFactory::setRequestInfo() to service wiring

https://gerrit.wikimedia.org/r/679991

Change 677730 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Move setLBFactoryTriggers from doMaintenance to service wiring

https://gerrit.wikimedia.org/r/677730

Krinkle renamed this task from 2% of load.php spent creating DBLoadBalancerFactory in WebStart.php to 3% of load.php spent creating DBLoadBalancerFactory in WebStart.php.Apr 28 2021, 10:19 PM
Krinkle updated the task description. (Show Details)

Change 682772 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Use REQUEST_TIME_FLOAT for LBFactory::getCPInfoFromCookieValue

https://gerrit.wikimedia.org/r/682772

In comparing last week to the week before, I also noticed a regression that was introduced in WebStart performance since the filing of this task.

  1. A new whopping 4.8% chunk has appeared inside this portion of the chart, relating to MediaWikiServices::getInstance most of which is due to loading HookRunner.php. This regression is tracked at T274041.

With this as our base line, let us look at today with the above changes applied:

The DBLoadBalancer cost has gone, however the WebRequest::getIP has not. I forgot that there is another early unconditional call, namely for the wfDebug log. I think once that one is avoided, we can probably have most "simple" web requests reach their pre-send stage without this method being called since afaik there are no other callers here in unconditional code (codesearch).

The MediaWikiServices regression has unconfortunately shadowed the cost this task was about, so we might not yet see a movement between the histogram buckets. One step at a time...

Based on an XHGui profile with all call pairs and counts, I found another indirect getIP call which is for MediaWiki\Session\SessionManager::singleton()->logPotentialSessionLeakage(); also in Setup.php. That's the only other one on load.php and action=raw index.php reqs.

For pageviews and edits etc. there's naturally more, such as for block detection and "my" links for unreg users with a session. Also for any reqs there's other calls to getIP for any Logstash message, but the key thing to keep in mind is that this formatting only happpens when a message isn't filtered out by debuglog/severity configuration, and the vast vast majority of web requests in production never ever send any message to Logstash, so despite being in Monolog/WikiProcessor and that being instantiated on every request, it's only actually called quite rarely when a notable diagnostic message or uncaught exception gets logged etc.

So my next focus will be on undoing that autoloader cost rather than avoiding getIP calls.

Change 679984 abandoned by Krinkle:

[mediawiki/core@master] Setup: Move $wgSharedDB init to ServiceWiring

Reason:

https://gerrit.wikimedia.org/r/679984

This comment was removed by Krinkle.

Now it's only 1.3%, so is this resolved? This doesn't seem the right place to discuss unrelated startup time issues.

Krinkle claimed this task.