Document ToolsDB failover process for Clouddb Admins
Closed, Resolved (Public)

Description

Document a general failover process in Wikitech for Clouddb Admins to follow for emergency failover of ToolsDB in case DBAs are not available to help with the process.

Event Timeline

Bstorm triaged this task as Medium priority. Feb 21 2019, 6:54 PM
Bstorm created this task.
Bstorm added a subscriber: Andrew.

Testing is worth thinking about here as well. It would be very valuable, but it would require coordinating an outage with the four users who have non-replicated tables.

For now, the doc is pretty good. I got help from the DBAs, which is also a reminder that we should be doing regular failover testing so that failovers are not quite so hard to do.

Let's schedule a controlled failover in a couple of months!

I'd love to... except that we haven't set up a viable solution for the four non-replicated tables. Once we contact them, the owners may be able to dump their tables ahead of time so that they can reinstate them after the failover. That may be the way forward, now that I have moved wikilabels to its own replicated pair.