
Make $wgSessionCacheType and $wgMainStash caches multi-DC ready
Closed, Declined | Public

Description

In session.php we have:

$sessionRedis = array(
	'eqiad' => array( '127.0.0.1:6380' ),
	'codfw' => array( '127.0.0.1:6380' ),
);

Right now, MW in each DC can only talk to the local redis cluster.

We want each server to run both an eqiad and a codfw nutcracker instance: one for the local redis cluster, the other for the remote one. READ_LATEST reads need to use the 'eqiad' redis cluster, even when it is remote.

For replication to actually work, I see two obvious options:
a) Set up cross-DC replication in puppet for each corresponding server and have MW use ReplicatedBagOStuff. We have 18 hash tags for nutcracker for each "slot", with two servers using any given tag (one in each DC). One problem with this is that if a master fails, the master DC does failover for writes, but the slave DC is still looking in the wrong place.
b) Use MultiWriteBagOStuff with 'async' mode for the remote redis cluster. This avoids the problem in (a), though it is also "best effort" only: if a write fails to reach both clusters, it is never retried (not without additional code/mechanisms). For the data stored there now (ephemeral and temporary), we could actually tolerate that.
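A rough sketch of how the two options could look in configuration, for comparison only. The cache key names, the `$masterDc`/`$localDc`/`$remoteDc` variables, and the distinct port per DC are assumptions made up for illustration and are not taken from wmf-config; the `writeFactory`/`readFactory`, `caches`, and `replication` parameters are the ones ReplicatedBagOStuff and MultiWriteBagOStuff document.

```php
<?php
// Hypothetical values: which DC is the master and which one this server is in.
$masterDc = 'eqiad';
$localDc = 'codfw';
$remoteDc = ( $localDc === 'eqiad' ) ? 'codfw' : 'eqiad';

// One nutcracker instance per DC on each server (ports are made up).
$sessionRedis = array(
	'eqiad' => array( '127.0.0.1:6380' ),
	'codfw' => array( '127.0.0.1:6381' ),
);

// A plain RedisBagOStuff definition per cluster.
// In MediaWiki, $wgObjectCaches already exists; here PHP creates it on first write.
foreach ( $sessionRedis as $dc => $servers ) {
	$wgObjectCaches["redis_$dc"] = array(
		'class' => 'RedisBagOStuff',
		'servers' => $servers,
	);
}

// Option (a): redis-level replication carries data across DCs.
// Writes and READ_LATEST reads go to the master DC; other reads stay local.
$wgObjectCaches['sessions-replicated'] = array(
	'class' => 'ReplicatedBagOStuff',
	'writeFactory' => $wgObjectCaches["redis_$masterDc"],
	'readFactory' => $wgObjectCaches["redis_$localDc"],
);

// Option (b): MW itself writes to both clusters.
// With 'async', writes to the second cache are deferred (best effort).
$wgObjectCaches['sessions-multiwrite'] = array(
	'class' => 'MultiWriteBagOStuff',
	'caches' => array(
		$wgObjectCaches["redis_$localDc"],
		$wgObjectCaches["redis_$remoteDc"],
	),
	'replication' => 'async',
);

// Point the session cache and main stash at whichever option is chosen.
$wgSessionCacheType = 'sessions-multiwrite';
$wgMainStash = 'sessions-multiwrite';
```

In this sketch, option (b) lists the local cluster first, so reads are served locally. Whether that ordering satisfies the READ_LATEST requirement above would still need to be checked.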

Event Timeline

aaron created this task. Sep 4 2015, 9:58 PM
aaron claimed this task.
aaron raised the priority of this task to Normal.
aaron updated the task description.
aaron removed a project: Patch-For-Review.
aaron set Security to None.
aaron added subscribers: Krinkle, jcrespo, Glaisher and 12 others.
aaron renamed this task from "Move 'session' cache in wmf-config to use ReplicatedBagOStuff with local and master nutcracker instances" to "Make the 'sessions' redis cache multi-DC ready". Oct 2 2015, 7:28 PM
aaron updated the task description.

Change 244325 had a related patch set uploaded (by Ori.livneh):
Make the 'sessions' redis cache multi-DC ready

https://gerrit.wikimedia.org/r/244325

Change 244325 merged by Ori.livneh:
Make the 'sessions' redis cache multi-DC ready

https://gerrit.wikimedia.org/r/244325

Change 244361 had a related patch set uploaded (by Ori.livneh):
Make the redis cache configuration multi-DC-ready

https://gerrit.wikimedia.org/r/244361

ori added a subscriber: ori. Oct 7 2015, 10:12 PM

@aaron and I discussed this and I think we agreed on option (b) (MultiWriteBagOStuff w/async) for now, simply because the scope of the multi-DC work is already huge and in order to have any ability to plan at all, we need to nail down what we can.

Change 244361 merged by jenkins-bot:
Make the redis cache configuration multi-DC-ready

https://gerrit.wikimedia.org/r/244361

ori added a comment. Oct 8 2015, 5:27 AM

App servers now have separate RedisBagOStuff instances defined for each cluster. @aaron, I'm leaving the rest (defining a MultiWriteBagOStuff instance, etc.) for you.

aaron renamed this task from "Make the 'sessions' redis cache multi-DC ready" to "Make wgSessionCacheType and the main stash redis cache multi-DC ready". Oct 19 2015, 5:00 PM

Change 247325 had a related patch set uploaded (by Aaron Schulz):
Made the session/main stashes write to both DCs

https://gerrit.wikimedia.org/r/247325

Change 247325 abandoned by Aaron Schulz:
Made the session/main stashes write to both DCs

Reason:
Probably going with something like https://phabricator.wikimedia.org/T134811 instead for TLS

https://gerrit.wikimedia.org/r/247325

bd808 removed a subscriber: bd808. Jun 10 2016, 1:27 AM
aaron renamed this task from "Make wgSessionCacheType and the main stash redis cache multi-DC ready" to "Make wgSessionCacheType and the main stash cache multi-DC ready". Jun 15 2016, 4:11 PM
aaron renamed this task from "Make wgSessionCacheType and the main stash cache multi-DC ready" to "Make $wgSessionCacheType and $wgMainStash caches multi-DC ready". Aug 12 2016, 4:54 AM
aaron closed this task as Declined. Sep 4 2016, 12:33 PM

No need to have this and T134811 open.