Figure out and document the datacenter switchover process
Closed, Resolved (Public)

Description

We need to figure out the process for switching from eqiad to codfw and back in our current hot/cold setup. We previously did something similar for the pmtpa->eqiad switchover, but that process is wildly out of date by now. We'll need a new checklist, and one we should keep up-to-date going forward.

Related Objects

Status      Assigned
Invalid     None
Resolved    jcrespo
Resolved    Krinkle
Resolved    None
Resolved    jcrespo
Resolved    jcrespo
Resolved    jcrespo
Resolved    Joe
Resolved    Cmjohnson
Resolved    Cmjohnson
Resolved    Joe
Resolved    Joe
Resolved    RobH
Resolved    elukey
Resolved    Joe
Resolved    Krinkle
Resolved    aaron
Resolved    Krinkle
Resolved    elukey
Resolved    elukey
Resolved    Joe
Resolved    Joe
Resolved    jcrespo

Event Timeline

faidon raised the priority of this task to Medium.
faidon updated the task description.
faidon subscribed.

Change 275814 had a related patch set uploaded (by Giuseppe Lavagetto):
parsoid::testing: use master_dc variables

https://gerrit.wikimedia.org/r/275814

Change 275814 merged by Giuseppe Lavagetto:
parsoid::testing: use master_dc variables

https://gerrit.wikimedia.org/r/275814

This is a duplicate, but I would merge T114398 into it, as this one has activity.

In T114398, @aaron wrote:

See also T114271.

We need scripts and processes to do a planned switch from master datacenter A to B:

a) Go read-only on the app level (mostly MediaWiki)
b) Make sure write traffic stops
c) Go read-only for all data stores
d) Wait for all data stores in the B datacenter to catch up and be in sync with A
e) Make the B datacenter the new master datacenter (systems and app level)
f) End read-only mode

Read-only mode should be as short as possible so we can actually test this.
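To make the ordering of steps (a) to (f) concrete, here is a minimal sketch of how such a script could be structured. It is not existing tooling: the class name `Switchover` and all five hooks (`set_app_read_only`, `pending_writes`, `set_stores_read_only`, `replication_lag`, `promote`) are hypothetical placeholders that would have to be wired to whatever MediaWiki, database, and configuration tools are actually used. It returns the time spent in read-only mode, since keeping that window short is the stated goal.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Switchover:
    # All five callables are hypothetical hooks, not real tooling.
    set_app_read_only: Callable[[bool], None]     # (a)/(f) app-level read-only toggle
    pending_writes: Callable[[], int]             # (b) writes still in flight
    set_stores_read_only: Callable[[bool], None]  # (c)/(f) data-store read-only toggle
    replication_lag: Callable[[], float]          # (d) lag of datacenter B behind A
    promote: Callable[[str], None]                # (e) make the named DC the master
    poll_interval: float = 1.0
    timeout: float = 300.0
    steps_done: List[str] = field(default_factory=list)

    def _wait_for(self, condition: Callable[[], bool], what: str) -> None:
        # Poll until the condition holds, or give up after `timeout` seconds.
        deadline = time.monotonic() + self.timeout
        while not condition():
            if time.monotonic() >= deadline:
                raise TimeoutError(f"timed out waiting for {what}")
            time.sleep(self.poll_interval)

    def run(self, new_master: str) -> float:
        """Run steps (a)-(f) in order; return seconds spent in read-only mode."""
        start = time.monotonic()
        self.set_app_read_only(True)
        self.steps_done.append("a")
        self._wait_for(lambda: self.pending_writes() == 0, "writes to stop")
        self.steps_done.append("b")
        self.set_stores_read_only(True)
        self.steps_done.append("c")
        self._wait_for(lambda: self.replication_lag() <= 0, "replication to catch up")
        self.steps_done.append("d")
        self.promote(new_master)
        self.steps_done.append("e")
        self.set_stores_read_only(False)
        self.set_app_read_only(False)
        self.steps_done.append("f")
        return time.monotonic() - start
```

Note that this sketch does no rollback: if a wait times out, it raises and leaves the system read-only, so a real script would need an explicit abort path back to datacenter A.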

"we should keep up-to-date going forward" is not really a finite, actionable task, so I would consider this Resolved. There are multiple things still to fix, but they concern the process itself (e.g. better database orchestration), not only the documentation.

jcrespo claimed this task.