Decommission db1015
Closed, Resolved, Public

Description

db1015 is ready to be decommissioned:

  • - all system services confirmed offline from production use: Removed from mediawiki-config: https://gerrit.wikimedia.org/r/#/c/372816/
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.):
  • Host set to spare until @Cmjohnson removes it forever. https://gerrit.wikimedia.org/r/#/c/372818/

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port & change switch port label to asset tag
  • - remove production dns entries & remove hostname entries in mgmt dns
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - remove hostname label, remove hostname from visible label field in racktables (by onsite)
  • - system added back to spares tracking (by onsite)
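
The "puppet node clean, puppet node deactivate, salt key removed" step in the checklist above could be sketched as follows. This is only an illustration: the ticket does not record the exact commands that were run, the helper name `decom_commands` and the FQDN `db1015.example.org` are hypothetical, and the sketch prints the commands rather than executing them. It assumes `puppet node deactivate` is available (it requires PuppetDB integration) and that the commands would be run on the puppet and salt masters.

```python
import shlex
from typing import List


def decom_commands(fqdn: str) -> List[List[str]]:
    """Return the puppet/salt cleanup commands for one host, in checklist order.

    The commands are returned as argument lists and are NOT executed here.
    """
    return [
        # Remove the node's certificate and cached data from the puppet master.
        ["puppet", "node", "clean", fqdn],
        # Mark the node as deactivated in PuppetDB.
        ["puppet", "node", "deactivate", fqdn],
        # Delete the minion's accepted key on the salt master.
        ["salt-key", "-d", fqdn],
    ]


if __name__ == "__main__":
    # Placeholder FQDN; substitute the real one for the host being decommissioned.
    for cmd in decom_commands("db1015.example.org"):
        print(shlex.join(cmd))
```

Printing first and running by hand keeps a human in the loop, which seems appropriate for steps that cannot be interrupted once started.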

Event Timeline

Restricted Application added a project: Operations. Aug 18 2017, 12:50 PM
Restricted Application added a subscriber: Aklapper.
Marostegui moved this task from Triage to Next on the DBA board. Aug 18 2017, 12:50 PM
Marostegui updated the task description.

Change 372816 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1015

https://gerrit.wikimedia.org/r/372816

Change 372817 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s3.hosts: Remove db1015

https://gerrit.wikimedia.org/r/372817

Change 372818 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Remove db1015

https://gerrit.wikimedia.org/r/372818

Change 372816 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1015

https://gerrit.wikimedia.org/r/372816

Mentioned in SAL (#wikimedia-operations) [2017-08-21T09:15:52Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Remove db1015 - T173570 (duration: 00m 44s)

Mentioned in SAL (#wikimedia-operations) [2017-08-21T09:17:16Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Remove db1015 - T173570 (duration: 00m 44s)

Change 372818 merged by Marostegui:
[operations/puppet@production] mariadb: Remove db1015

https://gerrit.wikimedia.org/r/372818

Change 372817 merged by jenkins-bot:
[operations/software@master] s3.hosts: Remove db1015

https://gerrit.wikimedia.org/r/372817

Mentioned in SAL (#wikimedia-operations) [2017-08-21T12:35:47Z] <marostegui> Stop MySQL on db1015 to decommission it - T173570

Marostegui updated the task description.
Marostegui moved this task from Next to Blocked external/Not db team on the DBA board.

This host is now ready for the remaining steps from @Cmjohnson

Cmjohnson moved this task from Backlog to Decommission on the ops-eqiad board. Aug 21 2017, 10:16 PM

Change 373534 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Remove yaml files from db1015 and db1041

https://gerrit.wikimedia.org/r/373534

Change 373534 merged by Marostegui:
[operations/puppet@production] mariadb: Remove yaml files from db1015 and db1041

https://gerrit.wikimedia.org/r/373534

Mentioned in SAL (#wikimedia-operations) [2017-10-23T10:28:03Z] <marostegui> Remove /srv from db1015 as it has been stopped for weeks now and will be decommissioned (and it is alerting low on disk space) - T173570

@Marostegui during my decom checks I found db1015 in this file. Should a replacement be identified?

modules/admin/files/enforce-users-groups.sh

Hi Chris,

No worries about that, you can safely decommission this host.
Thanks for the heads up!
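
A decom check like the one above, which found db1015 still referenced in modules/admin/files/enforce-users-groups.sh, could be approximated with a small search over a local checkout of the puppet repository. This is a rough sketch, not the tooling actually used: the function name `find_references` and the checkout path are assumptions, and it does plain substring matching, so `db1015` would also match a longer name such as `db10150`.

```python
import os
from typing import Iterator, Tuple


def find_references(root: str, hostname: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for every line under root mentioning hostname."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata; prune in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" lets binary or oddly encoded files be scanned safely.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if hostname in line:
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Unreadable files (broken symlinks, permissions) are skipped.
                continue


if __name__ == "__main__":
    # "." is a placeholder for the path of a local repository checkout.
    for path, lineno, line in find_references(".", "db1015"):
        print(f"{path}:{lineno}: {line}")
```

Any hit still needs a human decision, as in this exchange, on whether the reference needs a replacement host or can simply be dropped.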

Cmjohnson updated the task description. Dec 12 2017, 5:42 PM
Cmjohnson updated the task description. Dec 19 2017, 8:18 PM

This host still shows up in puppetdb, i.e. the deactivate step was missed (visible e.g. in https://servermon.wikimedia.org/hosts/)

Cmjohnson closed this task as Resolved. Jan 10 2018, 3:47 PM