Document ToolsDB failover process for Clouddb Admins
Closed, Resolved · Public

Description

Document a general failover process in Wikitech for Clouddb Admins to follow for emergency failover of ToolsDB in case DBAs are not available to help with the process.

Event Timeline

Bstorm created this task. Feb 21 2019, 6:54 PM
Bstorm triaged this task as Normal priority.
Restricted Application added a subscriber: Aklapper. Feb 21 2019, 6:54 PM
Bstorm added a subscriber: Andrew.

Testing is worth thinking about here as well. It would be very valuable, but it would require coordinating an outage with the four users who have non-replicated tables.

bd808 moved this task from Backlog to ToolsDB on the Data-Services board. Mar 5 2019, 4:16 PM
Bstorm closed this task as Resolved. Fri, May 24, 11:40 PM

For now, the doc is pretty good. I got help from the DBAs. This is also a reminder that we should be doing regular failover testing, so that failovers stop being quite so hard to do.

Let's schedule a controlled failover in a couple of months!

I'd love to, except that we haven't set up a viable solution for the four non-replicated tables. Once we contact the owners, they may be able to dump their tables in preparation and reinstate them after the failover. That may be the way forward, now that I have moved wikilabels to its own replicated pair.