Page MenuHomePhabricator

Add automatic temporary feature disabling to encourage migration away from long deprecated feature
Open, Needs TriagePublicFeature

Description

In cases like this, it'd be really useful to have some sort of functionality like GitHub used in https://developer.github.com/changes/2020-02-14-deprecating-password-auth/ - where we can temporarily make things error in a hope people pay some more attention and fix their stuff...

I've been thinking about the same thing. The easiest thing to implement would be to 500 with a very low probability, but I like the solid chunk of time for a breakage much more. Perhaps add a config variable and do 2 consecutive config deploys?

Event Timeline

I don't know how we'd plumb it in, this potentially needs to be done in different "places" ad-hoc, so the API one might be a good place to do this first/try it. We probably don't need any sort of major generic framework around this....

I guess we can put some specific "check" in place we want to fail in a harder fashion, with a name, and then with a $wg that uses that name to set a time ranges.

if ( wfCheckForTemporaryDisable( 'feature-name' ) {
	$this->dieWithError( 'message', 'code', 'data', 500 );
}

And then wfCheckForTemporaryDisable does something like

function wfCheckForTemporaryDisable( $name ) {
	global $wgDisableThing;
	$time = time();
	return isset( $wgDisableThing[$name] ) &&
		$wgDisableThing[$name]['start'] >= $time &&
		$wgDisableThing[$name]['end'] <= $time;
}
$wgDisableThing = [
	'feature-name' => [ 'start' => 1627770000, 'end' => 1627790000 ]
];

I don't know if just causing 500s (or similar) is necessarily the best way, and could cause error spikes...

So extra logging might probably necessary, and maybe in the APIs case, dieing out with an error that isn't quite an internal server error...

And in the spirit of T254646: Reconsidering how we name things, I think we should avoid the term "brownout".

I don't know if just causing 500s (or similar) is necessarily the best way, and could cause error spikes...

So extra logging might probably necessary, and maybe in the APIs case, dieing out with an error that isn't quite an internal server error...

And as a further thought.. Just dying with 500s may cause bug reports etc from those that haven't seen it. And may be difficult to search for

If we respond with an error code, a useful message, link to the relevant bug etc, we're doing the communication at point of usage. And obviously, we don't respond with the actual normal output, like we do with warnings (which get ignored :)).

If they're not handling errors in a good fashion... Well, that's not our fault either :)

Aklapper changed the subtype of this task from "Task" to "Feature Request".Aug 2 2021, 6:39 AM
$wgDisableThing = [
	'feature-name' => [ 'start' => 1627770000, 'end' => 1627790000 ]
];

I'd include the ability to have multiple shutdown periods defined for one feature, for example if you want to fail a certain percentage of requests during one time period and immediately after for another period fail all requests. Additionally I'd make it a service (name proposal: FeatureShutdown) instead of a global function.

+1 to the general concept.

To start with I think we could just do it manually with some if ( mt_rand( ... ) == 0 ) blocks at registration points and then building global variables/functions/services once we have a better idea on how it gets used.

Change 722940 had a related patch set uploaded (by Majavah; author: Majavah):

[mediawiki/core@master] Introduce FeatureShutdown

https://gerrit.wikimedia.org/r/722940