SiteConfiguration::getConfig() does not work in Wikimedia production
Closed, Resolved (Public)

Description

When you try to use SiteConfiguration::getConfig() to get another wiki's configuration, the code tries to call out to maintenance/getConfiguration.php via shell. The problem is that Maintenance.php has this check:

		if ( isset( $_SERVER ) && isset( $_SERVER['REQUEST_METHOD'] ) ) {
			$this->error( 'This script must be run from the command line', true );
		}

This aborts the script, because the shelled-out process inherits REQUEST_METHOD from the web server request through the environment. If you call putenv('REQUEST_METHOD'); before the call (which removes REQUEST_METHOD from the environment), the call works.
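The inheritance described above can be reproduced outside MediaWiki. The following is only a minimal sketch: the helper name childSeesRequestMethod() is made up for illustration, the first putenv() call stands in for the web server exporting the variable, and the child checks getenv() rather than running Maintenance.php itself.

```php
<?php
// Simulate the web server (e.g. a FastCGI worker) having REQUEST_METHOD
// in its process environment. This line is an assumption for the demo.
putenv( 'REQUEST_METHOD=GET' );

/**
 * Hypothetical helper: spawn a child PHP process via the shell and report
 * whether that child sees REQUEST_METHOD in its environment.
 */
function childSeesRequestMethod(): bool {
	$childCode = "echo getenv('REQUEST_METHOD') === false ? 'unset' : 'set';";
	$cmd = escapeshellarg( PHP_BINARY ) . ' -r ' . escapeshellarg( $childCode );
	return trim( (string)shell_exec( $cmd ) ) === 'set';
}

// The child inherits the variable from the parent's environment.
$before = childSeesRequestMethod();

// putenv() with a bare name (no "=") removes the variable from the environment.
putenv( 'REQUEST_METHOD' );

// A child spawned now no longer sees it.
$after = childSeesRequestMethod();

var_dump( $before, $after );
```

Note that this clears the variable for the rest of the current process as well, so it is a workaround rather than a fix.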

Event Timeline

Smalyshev raised the priority of this task from to Needs Triage.
Smalyshev updated the task description.
Smalyshev added a subscriber: Smalyshev.

SiteConfiguration::getConfig() uses wfShellExec() to call maintenance/getConfiguration.php. wfShellExec() has a parameter that allows you to pass env vars to the executed process. One way to make this codepath possible again would be to add a guard condition like && !isset( $_ENV['VIA_WFSHELLEXEC'] ) to the cli guard and have SiteConfiguration::getConfig() set that env var.

This idea is not an endorsement of calling maintenance/getConfiguration.php from within a web request in the first place but it looks like someone thought that was a good/ok idea in 1.21 and it was subsequently broken by other attempts to protect maintenance scripts from arbitrary web execution. Maybe requiring that VIA_WFSHELLEXEC has some typically non-public & per farm value would be a good idea too?
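A rough sketch of what such a guard could look like, written as a standalone function so it can be read without MediaWiki. The function name shouldRejectAsWebRequest() and the $expectedToken parameter are hypothetical; VIA_WFSHELLEXEC is only the name proposed in this comment, not an existing MediaWiki setting.

```php
<?php
/**
 * Hypothetical standalone version of the CLI guard with the proposed exception.
 *
 * @param array $server Stand-in for $_SERVER
 * @param array $env Stand-in for $_ENV
 * @param string|null $expectedToken Optional per-farm secret; null means
 *   the mere presence of VIA_WFSHELLEXEC is accepted
 * @return bool True if the script should refuse to run
 */
function shouldRejectAsWebRequest( array $server, array $env, ?string $expectedToken = null ): bool {
	if ( !isset( $server['REQUEST_METHOD'] ) ) {
		// No REQUEST_METHOD: looks like a plain command-line run.
		return false;
	}
	if ( !isset( $env['VIA_WFSHELLEXEC'] ) ) {
		// REQUEST_METHOD is set and the marker is absent: treat as web execution.
		return true;
	}
	if ( $expectedToken === null ) {
		// Marker present and no secret configured: allow.
		return false;
	}
	// Marker present: allow only if it matches the secret (constant-time compare).
	return !hash_equals( $expectedToken, (string)$env['VIA_WFSHELLEXEC'] );
}
```

With the secret variant, SiteConfiguration::getConfig() would pass the same value through the env parameter of wfShellExec(), so a web request that merely sets the variable name would still be rejected.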

Ideally, there would be an API that allows loading the configuration of a given wiki without shelling out, but that might require more effort.

I think that might have to wait for the Configuration database 2 RfC or some other system that brings structured and standardized organization to wikifarm configuration. As things stand today, MediaWiki has the SiteConfiguration helper but there is no standard method for how this gets populated or used in the wikifarm's runtime.

I suppose there could be some api action that exported a wiki's configuration data but it would need some sort of tightly controlled authorization mechanism which would then lead to the consuming wiki needing to hold OAuth or other authentication credentials to access the config of a foreign wiki.

I don't mean API as in an api.php call, I mean an internal API that I could call. Basically, SiteConfiguration without the shell-out and related nastiness. Not sure if it's easy to do though, given that everything relies on globals :(

*nod* that's what I was thinking of with the Configuration database 2 RfC reference. Things are actually even crazier than the globals for the WMF wikifarm due to all the interesting gymnastics we do in rOMWC Wikimedia - MediaWiki Config.

Legoktm renamed this task from SiteConfiguration::getConfig() does not work in production to SiteConfiguration::getConfig() does not work in Wikimedia production. Sep 19 2015, 12:58 AM

I don't think shelling out like this is a good idea. $wgConf->get() should generally work on Wikimedia sites though.

Having the web service shell out to mwscript maintenance/getConfiguration.php eventually caused account creation to stop entirely and led to the rollback of 1.28.0-wmf.19.

I am making this a blocker.

So, what's the status/next step here?

For our needs in the cluster, I might suggest defining an internal API endpoint (similar to how /rpc/RunJobs.php is an internal-only endpoint) that can take the necessary arguments and output the requested configuration. This wouldn't be particularly hard to do if we only meet the needs of WMF production; defining something more generic might be a more involved task.

Change 396301 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] Fix maintenance script failure when run as a child of a FastCGI worker

https://gerrit.wikimedia.org/r/396301

Incredible how a single line of rubbish code I wrote in 2004 can have so many people scratching their heads for so long. I hadn't seen this task before today.

Change 396301 merged by jenkins-bot:
[mediawiki/core@master] Fix maintenance script failure when run as a child of a FastCGI worker

https://gerrit.wikimedia.org/r/396301

Umherirrender assigned this task to tstarling.
Umherirrender triaged this task as Medium priority.