Example configuration clauses for using RESTBagOStuff with Kask
Closed, Resolved; Public

Description

We need the configuration instructions necessary for using RESTBagOStuff and Kask as our session store, and leaving the main stash where it is.

I think this means:

  • an addition to $wgObjectCaches to define the store (I suggest 'kask-session' as the key)
  • an addition to $wgObjectCaches to define the transitional store per T222742 using MultiwriteBagOStuff (I suggest 'kask-transition')
  • setting $wgSessionCacheType to 'kask-transition'
  • setting $wgObjectCacheSessionExpiry to the same value as is configured for kask (9 * 3600?)
  • a comment above or near $wgObjectCacheSessionExpiry reminding the reader that a session expiry different from the one configured in Kask will give unexpected results, so the two must be kept in sync

At some point we'd need to switch $wgSessionCacheType to 'kask-session'. Since our configuration code is PHP, there's a temptation to do something like check the date, and use the different stores based on how much time has passed since the switchover date, but... that's probably being too clever.
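For illustration only, here is a minimal sketch of what that date-based approach might look like. The function name `pickSessionCache` and the timestamps are hypothetical and not part of any proposed change; it assumes the three store keys discussed in this task.

```php
<?php
// Hypothetical helper: choose the session store from the time elapsed
// since a switchover timestamp. Not part of mediawiki-config.
function pickSessionCache( int $now, int $switchover, int $expiry ): string {
	if ( $now < $switchover ) {
		// Before the switchover: sessions stay in Redis only.
		return 'redis_local';
	}
	if ( $now < $switchover + $expiry ) {
		// Sessions created before the switchover may still be live,
		// so keep using the transitional multi-write store.
		return 'kask-transition';
	}
	// Every pre-switchover session has expired; use Kask alone.
	return 'kask-session';
}

// Example with made-up timestamps and a 3600-second expiry.
echo pickSessionCache( 1000, 1000, 3600 ), "\n"; // kask-transition
```

The drawback is that the final switch would happen with no deployment and possibly no one watching the logs, which is why an explicit config change seems safer.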

Event Timeline

T220401 contains deployment details. Here's the relevant part, in action:

bpirkle@mwmaint1002:~$ curl https://sessionstore.discovery.wmnet:8081/sessions/v1/this-fake-key-should-not-exist
{
  "type": "https://www.mediawiki.org/wiki/Kask/errors/not_found",
  "title": "Not found",
  "detail": "The value you requested was not found",
  "instance": "/sessions/v1/this-fake-key-should-not-exist"
}

A search of the mediawiki-config repository for "wgObjectCaches" finds occurrences in:

mediawiki-config/wmf-config/CommonSettings.php (MultiWriteBagOStuff, key name mysql-multiwrite)
mediawiki-config/wmf-config/mc-labs.php (MemcachedPeclBagOStuff, key name mcrouter)
mediawiki-config/wmf-config/mc.php (MemcachedPeclBagOStuff, key names memcached-pecl and mcrouter)
mediawiki-config/wmf-config/redis.php (RedisBagOStuff, key names redis_master and redis_local)

A search for "wgSessionCacheType" finds, in file wmf-config/InitialiseSettings.php,

'wgSessionCacheType' => [
	'default' => 'redis_local',  // declared in redis.php
	'wikitech' => 'memcached-pecl',
],

If I'm reading this right, everything except wikitech currently uses its local RedisBagOStuff (as defined in redis.php) while wikitech uses a MemcachedPeclBagOStuff (as defined in mc.php). Using eval.php to dump live configuration values for a few wikis (including enwiki and labswiki) seems to confirm this.

It looks like we should add our $wgObjectCaches entries to mediawiki-config/wmf-config/CommonSettings.php, so that they'll be accessible everywhere, and then change wgSessionCacheType in wmf-config/InitialiseSettings.php. Specifically:

CommonSettings.php:

$wgObjectCaches['kask-session'] = [
	'class' => 'RESTBagOStuff',
	'url' => 'https://sessionstore.discovery.wmnet:8081/sessions/v1/',
	'httpParams' => [
		'writeHeaders' => [
			'content-type' => 'application/octet-stream',
		],
		'writeMethod' => 'POST',
	],
	'extendedErrorBodyFields' => [ 'type', 'title', 'detail', 'instance' ]
];
$wgObjectCaches['kask-transition'] = [
	'class' => 'MultiWriteBagOStuff',
	'caches' => [
		// Tier 0: the existing Redis session store, looked up by its key
		0 => [
			'factory' => [ 'ObjectCache', 'getInstance' ],
			'args' => [ 'redis_local' ]
		],
		// Tier 1: Kask, via RESTBagOStuff
		1 => [
			'class' => 'RESTBagOStuff',
			'url' => 'https://sessionstore.discovery.wmnet:8081/sessions/v1/',
			'httpParams' => [
				'writeHeaders' => [
					'content-type' => 'application/octet-stream',
				],
				'writeMethod' => 'POST',
			],
			'extendedErrorBodyFields' => [ 'type', 'title', 'detail', 'instance' ]
		]
	],
	'replication' => 'async',
	'reportDupes' => false
];

InitialiseSettings.php:

'wgSessionCacheType' => [
	'default' => 'kask-transition', // change this to kask-session after the transition is complete
	'wikitech' => 'memcached-pecl',
],

I'm assuming we want all wikis currently using redis_local to switch to kask-transition. If we want any of them to stay on redis_local, we can adjust InitialiseSettings.php accordingly.

Note that redis_local is used for other things (wgMainStash), so we should not alter or remove it, even once the transition is complete. We should simply stop using it for sessions. We *can* remove kask-transition once the transition is complete, if we like.

I didn't find wgObjectCacheSessionExpiry anywhere in the mediawiki-config repository, so I think all wikis are using the MediaWiki core default value from includes/DefaultSettings.php. This means we don't have a foundation-specific override of that variable to add the comment about making that setting match Kask (unless we add an override just so we can add a comment, which seems silly). I suggest a generic comment in DefaultSettings.php, like:

/**
 * The expiry time to use for session storage, in seconds.
 * If using external session storage (e.g. Kask), you probably want to match its expiry time.
 */
$wgObjectCacheSessionExpiry = 3600;

If no one says they see any problems with the above, I'll get Gerrit changes together. Also, we still want to do a staging release for testing before deploying this to all wikis, per T222099.

One issue here is that you're planning to hard-code the session store host into CommonSettings.php. That's not great, as it will not work for labs, for example. The common pattern here is to use $wmfLocalServices for hosts and then append the path portion of the URI in CommonSettings.

Second, why are we redefining the kask-session config within kask-transition? Can we just use the same pattern as for redis_local and just use ObjectCache::getInstance('kask-session') to obtain an already defined instance?

> setting $wgObjectCacheSessionExpiry to the same value as is configured for kask (9 * 3600?)

Session storage (Kask) is currently configured for 86400 seconds, a value taken from the default configuration. If that value is correct, it is only by accident. In other words, whatever this value should be, Kask will need to be (re)configured as well.

> One issue here is that you're planning to hard-code the session store host into CommonSettings.php. That's not great, as it will not work for labs, for example. The common pattern here is to use $wmfLocalServices for hosts and then append the path portion of the URI in CommonSettings.
>
> Second, why are we redefining the kask-session config within kask-transition? Can we just use the same pattern as for redis_local and just use ObjectCache::getInstance('kask-session') to obtain an already defined instance?

Thank you, I will do both of those as you suggest.

To confirm, instead of storing the host in CommonSettings.php, I will store it in ProductionServices.php, LabsServices.php, and TestServices.php (probably using a "kask" key), and I can then do something like "$wmfLocalServices['kask']" in CommonSettings.php. Do I understand that correctly?

Exactly.
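A minimal sketch of how the two suggestions above might combine. It assumes the `kask` entry in `$wmfLocalServices` holds a base URL with no trailing slash (the exact shape of that entry is not settled in this thread), and it reuses the `factory` pattern already shown for `redis_local`. The first two assignments are stand-ins so the snippet runs on its own; in the real config both variables already exist.

```php
<?php
// Stand-in for the per-environment services file (ProductionServices.php etc.).
// Assumption: the value is a base URL without a trailing slash.
$wmfLocalServices = [ 'kask' => 'https://sessionstore.discovery.wmnet:8081' ];
// Stand-in only; the real $wgObjectCaches is already populated.
$wgObjectCaches = [];

// CommonSettings.php: host comes from $wmfLocalServices, path is appended here.
$wgObjectCaches['kask-session'] = [
	'class' => 'RESTBagOStuff',
	'url' => "{$wmfLocalServices['kask']}/sessions/v1/",
	'httpParams' => [
		'writeHeaders' => [
			'content-type' => 'application/octet-stream',
		],
		'writeMethod' => 'POST',
	],
	'extendedErrorBodyFields' => [ 'type', 'title', 'detail', 'instance' ],
];

// The transitional store refers to both tiers by key instead of
// repeating the Kask definition.
$wgObjectCaches['kask-transition'] = [
	'class' => 'MultiWriteBagOStuff',
	'caches' => [
		0 => [
			'factory' => [ 'ObjectCache', 'getInstance' ],
			'args' => [ 'redis_local' ],
		],
		1 => [
			'factory' => [ 'ObjectCache', 'getInstance' ],
			'args' => [ 'kask-session' ],
		],
	],
	'replication' => 'async',
	'reportDupes' => false,
];
```

With this shape, labs and test environments would only need their own `kask` value in their services file; the cache definitions themselves stay identical.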

Prospective configuration change pushed for review under T222099. That change will transition only testwiki to use Kask, for testing. Once T222099 is closed and testing is complete, next steps will be:

  1. deploy a config change to switch 'default' value for wgSessionCacheType to 'kask-transition'
  2. confirm sessions are being happily stored to Kask and that logs are clear
  3. wait a sufficient time period so that all sessions are in Kask ($wgObjectCacheSessionExpiry is currently 3600)
  4. push a config change to:
    1. change 'default' value for wgSessionCacheType from 'kask-transition' to 'kask-session', thereby removing Redis and putting sessions solely in Kask
    2. change 'default' $wgObjectCacheSessionExpiry to match the Kask setting (if Kask uses something other than 3600)
    3. remove 'testwiki' overrides, so that it follows the 'default' setting
  5. confirm sessions are happily functioning and logs are clear

Note One: during the transition, testwiki will still be using a TTL ($wgObjectCacheSessionExpiry) value of 3600. The Kask TTL setting must therefore be at least 3600 seconds. If the Kask setting is longer, the transition will still be okay, but we will do more writes to Kask than we would with a longer MediaWiki setting, because MediaWiki will think the sessions need to be rewritten.
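The constraint in Note One can be sketched as a small check. The function name `compareSessionTtl` and its return labels are hypothetical; the numbers are the ones mentioned in this task (3600 for MediaWiki, 86400 for the current Kask default).

```php
<?php
// Hypothetical helper: classify a MediaWiki expiry / Kask TTL pair.
function compareSessionTtl( int $mwExpiry, int $kaskTtl ): string {
	if ( $kaskTtl < $mwExpiry ) {
		// Kask would expire sessions that MediaWiki still treats as live.
		return 'unsafe';
	}
	if ( $kaskTtl > $mwExpiry ) {
		// Works, but MediaWiki rewrites sessions more often than Kask requires.
		return 'safe-extra-writes';
	}
	return 'matched';
}

echo compareSessionTtl( 3600, 86400 ), "\n"; // safe-extra-writes
```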

Note Two: removing the 'testwiki' overrides will not change anything for testwiki, because the new 'default' settings will match the 'testwiki' testing settings. Removing the overrides is just to simplify the configuration, because there will no longer be any need to override.

Note Three: I will push the config change for #1 above to Gerrit after testing under T222099 is complete. This is to ensure that no one misunderstands and deploys it early.

After discussion with @WDoranWMF and @daniel , it might be prudent to deploy to production in multiple steps. One option would be:

  1. deploy a config change to use testwiki as a staging server, under T222099
  2. test, check logs, and monitor until we are satisfied that testwiki and Kask are happy
  3. deploy a config change to make all wikis except English Wikipedia and Wikidata use Kask
    1. set default session storage to Kask
    2. add special cases for English Wikipedia and Wikidata so that they continue to use Redis for session storage
  4. test, check logs, and monitor until we are satisfied that the additional wikis are happy, and that Kask is still happy
  5. deploy a config change that removes the English Wikipedia and Wikidata special cases
  6. test, check logs, and monitor until we are satisfied that the entire system is happy
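As a rough sketch, step 3 of this option might look like the following in InitialiseSettings.php. It assumes "Kask" here means the transitional store 'kask-transition' (the list above does not say which key), and that the special cases use the database names enwiki and wikidatawiki. The enclosing `$settings` variable is a stand-in so the fragment runs on its own.

```php
<?php
// Stand-in wrapper; in InitialiseSettings.php this entry sits inside
// the existing settings array.
$settings = [
	'wgSessionCacheType' => [
		// Assumption: the default moves to the transitional store first.
		'default' => 'kask-transition',
		'wikitech' => 'memcached-pecl',
		// Special cases removed again in step 5.
		'enwiki' => 'redis_local',
		'wikidatawiki' => 'redis_local',
	],
];
```

Step 5 would then delete the two special-case lines, leaving every wiki except wikitech on the default.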
So, this is done, right?