Decommission db1054
Closed, ResolvedPublic

Description

db1054 was the s2 primary master and has been failed over to db1066 (T194870).

Let's wait a couple of days before decommissioning

  • Set up a new candidate master for s2 - db1076
  • Compare data between db1054 and db1076
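The task does not record how the comparison was run. As a minimal sketch only, assuming per-table checksums have already been collected from each host (for instance with `CHECKSUM TABLE`, which is an assumption here and not something the task states), the two result sets could be compared like this. The function name and the sample values are hypothetical.

```python
from typing import Dict, List, Optional

# Sentinel that marks a table absent from one host, so that a missing table
# is never confused with a table whose checksum came back as NULL/None.
_MISSING = object()


def diff_table_checksums(
    source: Dict[str, Optional[int]],
    candidate: Dict[str, Optional[int]],
) -> List[str]:
    """Return the sorted names of tables that differ between two hosts.

    A table is reported when its checksum differs or when it exists on
    only one of the two hosts.
    """
    mismatched = []
    for table in sorted(set(source) | set(candidate)):
        if source.get(table, _MISSING) != candidate.get(table, _MISSING):
            mismatched.append(table)
    return mismatched


# Hypothetical checksums; real values would come from the two database hosts.
db1054_checksums = {"page": 1111, "revision": 2222, "user": 3333}
db1076_checksums = {"page": 1111, "revision": 2222, "user": 3333}

# An empty list means no differences were found between the two hosts.
print(diff_table_checksums(db1054_checksums, db1076_checksums))
```

Comparing checksums on a replica that is still receiving writes can report false differences, so a check like this is only meaningful when both hosts are at the same replication position.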

Decommission Checklist

  • - all system services confirmed offline from production use - should be done by DBA team
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration - should be done by DBA team
  • - any service group puppet/hiera/dsh config removed - should be done by DBA team
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.) - should be done by DBA team: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/442014/

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal) asw-a-eqiad:ge-3/0/32
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: system added back to spares tracking (by onsite)

Event Timeline

Marostegui triaged this task as Medium priority. Jun 13 2018, 6:38 AM
Marostegui moved this task from Triage to In progress on the DBA board.

Change 440068 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1076

https://gerrit.wikimedia.org/r/440068

Change 440068 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1076

https://gerrit.wikimedia.org/r/440068

Mentioned in SAL (#wikimedia-operations) [2018-06-13T08:04:17Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog change - T197063 (duration: 00m 57s)

Mentioned in SAL (#wikimedia-operations) [2018-06-13T08:04:36Z] <marostegui> Stop MySQL and reboot db1076 - T197063

The main tables have been checked and no differences were found.

Change 442014 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Set db1054 as spare

https://gerrit.wikimedia.org/r/442014

Change 442014 merged by Marostegui:
[operations/puppet@production] mariadb: Set db1054 as spare

https://gerrit.wikimedia.org/r/442014

Mentioned in SAL (#wikimedia-operations) [2018-06-26T05:52:12Z] <marostegui> Stop MySQL on db1054 as it is going to be decommissioned - T197063

Change 442015 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1054

https://gerrit.wikimedia.org/r/442015

Change 442015 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1054

https://gerrit.wikimedia.org/r/442015

Mentioned in SAL (#wikimedia-operations) [2018-06-26T05:57:16Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Remove db1054, it is going to be decommissioned T197063 (duration: 00m 57s)

Mentioned in SAL (#wikimedia-operations) [2018-06-26T05:58:17Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Remove db1054, it is going to be decommissioned T197063 (duration: 00m 55s)

Marostegui updated the task description.
Marostegui moved this task from In progress to Done on the DBA board.

db1054 is now ready to be handed over to DCOps for decommissioning.

Vvjjkkii renamed this task from Decommission db1054 to c5aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii removed Cmjohnson as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii edited subscribers, added: Cmjohnson; removed: gerritbot, Aklapper.
Marostegui renamed this task from c5aaaaaaaa to Decommission db1054. Jul 1 2018, 8:11 PM
Marostegui assigned this task to Cmjohnson.
Marostegui lowered the priority of this task from High to Medium.
Marostegui updated the task description.

Change 445722 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom db1054

https://gerrit.wikimedia.org/r/445722

Change 445723 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom db1054 from repo

https://gerrit.wikimedia.org/r/445723

Change 445722 merged by RobH:
[operations/dns@master] decom db1054

https://gerrit.wikimedia.org/r/445722

Change 445723 merged by RobH:
[operations/puppet@production] decom db1054 from repo

https://gerrit.wikimedia.org/r/445723

RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH moved this task from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Cmjohnson updated the task description.