Restore the map data health and parity between clusters
Closed, Resolved, Public

Description

From T218097#8157470:

... DC switchover /.../ brought old and broken data in the codfw cluster. We're assessing the situation now and coordinating with SRE on the switchover to the most updated cluster and restoring the health and parity between the clusters as soon as possible.

There have been problems with this parity since at least the beginning of this year. OSM replication has occasionally been disabled on one or both clusters, and traffic has been switched back and forth between them.

It looks like no task has existed for this issue so far, so hopefully this one can be used to gather related tasks and to provide status updates. This general issue is likely the cause of several recent user-facing issues, including T313010, T313011, T312950, T312751, T312913, T307099, T306865, T311679. These could possibly be set as subtasks here, or simply merged, once it is confirmed that they are not also affected by some other issue.

Event Timeline

The main issues seem to be solved here. If this becomes an urgent matter again, there should be a new task, I guess. Thanks to everyone involved.