In session.php we have:
$sessionRedis = array( 'eqiad' => array( '127.0.0.1:6380' ), 'codfw' => array( '127.0.0.1:6380' ), );
Right now, MW in each DC can only talk to the local redis cluster.
We want each server to have two nutcracker instances, one for eqiad and one for codfw: one fronts the local redis cluster, the other the remote one. READ_LATEST reads need to use the 'eqiad' redis cluster, even if it is remote.
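A rough sketch of what that could look like on the MW side. The port 6381 for the remote-DC nutcracker instance and both helper function names are assumptions for illustration only; nothing like them exists in session.php today.

```php
<?php
// Hypothetical helper: build $sessionRedis so that each DC name maps to the
// nutcracker instance on this host that fronts that DC's redis cluster.
// Assumption: local-DC nutcracker stays on 6380, remote-DC one listens on 6381.
function sessionRedisServers( $localDC ) {
	$servers = array();
	foreach ( array( 'eqiad', 'codfw' ) as $dc ) {
		$port = ( $dc === $localDC ) ? 6380 : 6381;
		$servers[$dc] = array( "127.0.0.1:$port" );
	}
	return $servers;
}

// Hypothetical helper: READ_LATEST reads always go to the 'eqiad' cluster,
// everything else goes to the cluster of the local DC.
function pickSessionCluster( array $sessionRedis, $localDC, $readLatest ) {
	return $readLatest ? $sessionRedis['eqiad'] : $sessionRedis[$localDC];
}

// Example: an app server in codfw.
$sessionRedis = sessionRedisServers( 'codfw' );
$normalRead = pickSessionCluster( $sessionRedis, 'codfw', false );
$latestRead = pickSessionCluster( $sessionRedis, 'codfw', true );
```

On a codfw host, normal reads would hit the local instance and READ_LATEST reads would go through the second nutcracker instance to eqiad.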
For replication to actually work, I see two obvious options:
a) Set up cross-DC replication in puppet for each corresponding server and have MW use ReplicatedBagOStuff. We have 18 hash tags for nutcracker for each "slot", with two servers using any given tag (one in each DC). One problem with this is that if a master fails, the master DC fails over for writes, but the slave DC is still looking in the wrong place.
b) Use MultiWriteBagOStuff with 'async' mode for the remote redis cluster. This avoids the problem in (a), though it is also "best effort" only: if a write fails to reach both clusters, it is never retried (not without additional code/mechanisms). For the data in there now (ephemeral and temporary), we could actually tolerate that.
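For reference, option (b) might be configured roughly as below. This is a sketch, not a tested config: the cache name 'sessions-multi', the $localDC/$remoteDC variables and the 6381 port for the remote nutcracker are all assumptions, and the RedisBagOStuff parameters shown are only the minimum.

```php
<?php
// Assumed values for a host in codfw; a real config would derive these.
$localDC = 'codfw';
$remoteDC = 'eqiad';

// Assumption: second nutcracker instance (remote DC) listens on port 6381.
$sessionRedis = array(
	$localDC => array( '127.0.0.1:6380' ),
	$remoteDC => array( '127.0.0.1:6381' ),
);

// 'sessions-multi' is a made-up cache name for illustration.
$wgObjectCaches['sessions-multi'] = array(
	'class' => 'MultiWriteBagOStuff',
	'caches' => array(
		// Tier 0: local redis cluster, checked first on reads.
		0 => array(
			'class' => 'RedisBagOStuff',
			'servers' => $sessionRedis[$localDC],
		),
		// Tier 1: remote redis cluster.
		1 => array(
			'class' => 'RedisBagOStuff',
			'servers' => $sessionRedis[$remoteDC],
		),
	),
	// Best-effort deferred writes to the non-primary tier.
	'replication' => 'async',
);
```

Note that this sketch does nothing special for READ_LATEST; reads still start at the local tier, so that case would need separate handling.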