
Decommission db1060
Closed, ResolvedPublic

Description

db1060 is currently the sanitarium master for s2.
db1102 needs to be switched over to db1074 before proceeding with this decommissioning.

  • Change the master of db1102 from db1060 to db1074

Decommission Checklist

  • - all system services confirmed offline from production use - should be done by DBA team
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration - should be done by DBA team: Removed from config: https://gerrit.wikimedia.org/r/#/c/431703/
  • - any service group puppet/hiera/dsh config removed - should be done by DBA team
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.) - should be done by DBA team: https://gerrit.wikimedia.org/r/#/c/431704/

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal) asw2-c-eqiad:ge-2/0/18
  • - remove all remaining puppet references (including role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key

END NON-INTERRUPTIBLE STEPS
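
The debmonitor cleanup command in the steps above takes the host's FQDN as its only variable part. A minimal sketch of building that command with a sanity check on the FQDN follows. The function name `debmonitor_delete_command` and the example hostname `db1060.eqiad.wmnet` are assumptions for illustration (the task only names the host as db1060); the URL and certificate paths are copied from the checklist. The sketch only builds and prints the command, it does not run it.

```python
import re
import shlex
from typing import List

# Values copied from the checklist step above.
DEBMONITOR_URL = "https://debmonitor.discovery.wmnet/hosts/{fqdn}"
CERT_PATH = "/etc/debmonitor/ssl/cert.pem"
KEY_PATH = "/etc/debmonitor/ssl/server.key"

# Dot-separated lowercase labels; at least one dot is required so that a
# bare short name such as "db1060" is rejected.
FQDN_RE = re.compile(
    r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)+$"
)


def debmonitor_delete_command(host_fqdn: str) -> List[str]:
    """Return the curl argv that removes host_fqdn from debmonitor.

    Hypothetical helper: it only assembles the command from the checklist.
    """
    if not FQDN_RE.match(host_fqdn):
        raise ValueError("not a fully qualified host name: %r" % host_fqdn)
    return [
        "sudo", "curl", "-X", "DELETE",
        DEBMONITOR_URL.format(fqdn=host_fqdn),
        "--cert", CERT_PATH,
        "--key", KEY_PATH,
    ]


if __name__ == "__main__":
    # Assumed FQDN for db1060; adjust to the real one before use.
    print(shlex.join(debmonitor_delete_command("db1060.eqiad.wmnet")))
```

Rejecting short names up front avoids sending a DELETE for a path that debmonitor would not recognise as the host.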

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.

Related Objects

Status  Subtype  Assigned  Task
Resolved  None
Resolved  Cmjohnson

Event Timeline

Marostegui triaged this task as Medium priority. May 3 2018, 1:07 PM
Marostegui created this task.

Change 430598 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1060: Disable notifications

https://gerrit.wikimedia.org/r/430598

Change 430598 merged by Marostegui:
[operations/puppet@production] db1060: Disable notifications

https://gerrit.wikimedia.org/r/430598

Change 431516 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1074

https://gerrit.wikimedia.org/r/431516

Change 431516 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1074

https://gerrit.wikimedia.org/r/431516

Mentioned in SAL (#wikimedia-operations) [2018-05-07T07:18:32Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1074 - T193732 (duration: 00m 59s)

Mentioned in SAL (#wikimedia-operations) [2018-05-07T07:19:51Z] <marostegui> Stop replication in sync on db1060 and db1074 - T193732

Mentioned in SAL (#wikimedia-operations) [2018-05-07T07:32:22Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1074 - T193732 (duration: 00m 59s)

Change 431520 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: db1074 is now master for sanitarium

https://gerrit.wikimedia.org/r/431520

Change 431520 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: db1074 is now master for sanitarium

https://gerrit.wikimedia.org/r/431520

Mentioned in SAL (#wikimedia-operations) [2018-05-07T07:37:37Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: db1074 is now db1102's master - T193732 (duration: 00m 59s)

db1060 is no longer db1102's master.
Let's give it 24h before proceeding with the decommissioning tasks

Change 431703 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1060 from config

https://gerrit.wikimedia.org/r/431703

Change 431703 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1060 from config

https://gerrit.wikimedia.org/r/431703

Mentioned in SAL (#wikimedia-operations) [2018-05-08T06:25:30Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Remove db1060 from config - T193732 (duration: 00m 59s)

Mentioned in SAL (#wikimedia-operations) [2018-05-08T06:26:37Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Remove db1060 from config - T193732 (duration: 01m 01s)

Change 431704 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Set db1060 as spare

https://gerrit.wikimedia.org/r/431704

Change 431704 merged by Marostegui:
[operations/puppet@production] mariadb: Set db1060 as spare

https://gerrit.wikimedia.org/r/431704

Mentioned in SAL (#wikimedia-operations) [2018-05-08T06:51:16Z] <marostegui> Stop MySQL on db1060 as it will be decommissioned - T193732

Change 431705 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s2.hosts: Remove db1060

https://gerrit.wikimedia.org/r/431705

Change 431705 merged by jenkins-bot:
[operations/software@master] s2.hosts: Remove db1060

https://gerrit.wikimedia.org/r/431705

Marostegui moved this task from In progress to Done on the DBA board.

This is ready for @RobH and DC-Ops to take over

Vvjjkkii renamed this task from Decommission db1060 to vpdaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii removed RobH as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
Marostegui renamed this task from vpdaaaaaaa to Decommission db1060. Jul 1 2018, 6:30 PM
Marostegui assigned this task to Cmjohnson.
Marostegui lowered the priority of this task from High to Medium.
Marostegui updated the task description.
RobH updated the task description.
RobH updated the task description.

Change 447097 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom db1060

https://gerrit.wikimedia.org/r/447097

Change 447097 merged by RobH:
[operations/puppet@production] decom db1060

https://gerrit.wikimedia.org/r/447097

Change 447098 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom of db1060 prod dns

https://gerrit.wikimedia.org/r/447098

Change 447098 merged by RobH:
[operations/dns@master] decom of db1060 prod dns

https://gerrit.wikimedia.org/r/447098

RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH moved this task from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Cmjohnson updated the task description.