Be able to switch programmatically between deployment servers in codfw and eqiad
Closed, Resolved (Public)

Description

We have two deployment servers, one for each main datacenter. We want to be able to switch the deployment master role between them at will.

What we need for that to happen is:

  1. Add a logical host name that can point to the currently active deployment master. Users should connect to it when they want to deploy any software.
  2. Check that mira is correctly configured and is present in all puppet declarations regarding deployments.
  3. Set up a sync system for the trebuchet redis server and the /srv/deployment directory
  4. Try the switch at least once (this will also allow us to reimage tin while mira is the master).
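
As a rough illustration of steps 1 and 3, the sketch below models a switch as a small plan: which server becomes active, and which commands the passive server would run to pull /srv/deployment and to make its Redis a replica of the active one. This is not the actual puppet or scap implementation. The function and class names, the Redis port, and the choice of `rsync` and `redis-cli` as the sync mechanism are all assumptions; only the host names tin and mira and the /srv/deployment path come from this task.

```python
from dataclasses import dataclass
from typing import Tuple

# Host names taken from the task; everything else below is hypothetical.
SERVERS = ("tin", "mira")
DEPLOY_DIR = "/srv/deployment"
REDIS_PORT = 6379  # Redis default port, assumed here


@dataclass(frozen=True)
class SwitchPlan:
    active: str                  # server the logical host name should point to
    passive: str                 # server that mirrors the active one
    rsync_cmd: Tuple[str, ...]   # run on the passive server
    redis_cmd: Tuple[str, ...]   # makes the passive Redis a replica


def plan_switch(new_master: str) -> SwitchPlan:
    """Build the commands needed to make `new_master` the deployment master."""
    if new_master not in SERVERS:
        raise ValueError(f"unknown deployment server: {new_master!r}")
    passive = next(s for s in SERVERS if s != new_master)
    # The passive server pulls the deployment directory from the active one.
    rsync_cmd = (
        "rsync", "-a", "--delete",
        f"{new_master}:{DEPLOY_DIR}/", f"{DEPLOY_DIR}/",
    )
    # The passive server's Redis starts replicating from the active one.
    redis_cmd = (
        "redis-cli", "-h", passive,
        "SLAVEOF", new_master, str(REDIS_PORT),
    )
    return SwitchPlan(new_master, passive, rsync_cmd, redis_cmd)


if __name__ == "__main__":
    plan = plan_switch("mira")
    print("active:", plan.active)
    print("on", plan.passive + ":", " ".join(plan.rsync_cmd))
    print("on", plan.passive + ":", " ".join(plan.redis_cmd))
```

Keeping the plan as plain data makes it easy to review before anything is executed, which matters for a switch that is performed rarely and by hand.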

scap can already work in active-active mode from two servers, so nothing needs to change there.

An alternative would be to make the two servers active-active as masters, but that would require quite some work on trebuchet.

Event Timeline

Joe raised the priority of this task to High.
Joe updated the task description.
Joe added projects: SRE, HHVM.
Joe added subscribers: ori, RobH, mmodell and 17 others.

Change 264943 had a related patch set uploaded (by Giuseppe Lavagetto):
scap: use logical names for the rsync master

https://gerrit.wikimedia.org/r/264943

Change 264944 had a related patch set uploaded (by Giuseppe Lavagetto):
role::deployment: make it possible to switch between different servers

https://gerrit.wikimedia.org/r/264944

Change 264945 had a related patch set uploaded (by Giuseppe Lavagetto):
deployment: activate redis replica between the masters

https://gerrit.wikimedia.org/r/264945

Change 264943 merged by Giuseppe Lavagetto:
scap: use logical names for the rsync master

https://gerrit.wikimedia.org/r/264943

Change 264944 merged by Giuseppe Lavagetto:
role::deployment: make it possible to switch between different servers

https://gerrit.wikimedia.org/r/264944

Change 264945 merged by Giuseppe Lavagetto:
deployment: activate redis replica between the masters

https://gerrit.wikimedia.org/r/264945

Everything is in place for a switchover test, now scheduled for Monday, January 25th.

The test went well (sort of: we had an outage due to an operational error), and we're ready to switch back to tin, probably this week.

I will switch back to tin on Monday, February 15th, and document all the needed steps on wikitech. This ticket can be considered resolved after that.

Change 270710 had a related patch set uploaded (by Giuseppe Lavagetto):
deployment: switch back to tin

https://gerrit.wikimedia.org/r/270710

Change 270713 had a related patch set uploaded (by Giuseppe Lavagetto):
deployment: switch back to tin

https://gerrit.wikimedia.org/r/270713

Change 270713 merged by Giuseppe Lavagetto:
deployment: switch back to tin

https://gerrit.wikimedia.org/r/270713

Change 270710 merged by Giuseppe Lavagetto:
deployment: switch back to tin

https://gerrit.wikimedia.org/r/270710