Decommission parsercache hosts: pc1004.eqiad.wmnet pc1005.eqiad.wmnet pc1006.eqiad.wmnet
Open, High · Public

Description

The leases for pc1004, pc1005 and pc1006 expire on 31 Dec 2018 and the hosts need to be returned (T204556), so don't just stack this with the other decoms: it is a high priority for return this month (December 2018!)

The new hosts are online (T208383)

pc1004

Decommission Checklist

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update status in netbox (inventory for decom, planned for spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.

pc1005

Decommission Checklist

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update status in netbox (inventory for decom, planned for spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.

pc1006

Decommission Checklist

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update status in netbox (inventory for decom, planned for spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.
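
The DebMonitor removal step in the checklists above uses the same curl call for each host, with only ${HOST_FQDN} changing. As a rough sketch (not the wmf-decommission-host implementation, which this task does not show), the three commands can be generated as a dry run for review before anyone runs them. The URL, certificate path and key path are copied from the checklist; the helper name debmonitor_delete_argv is hypothetical.

```python
import shlex

# Hosts, URL and certificate paths are taken from this task's checklist.
HOSTS = ["pc1004.eqiad.wmnet", "pc1005.eqiad.wmnet", "pc1006.eqiad.wmnet"]
DEBMONITOR_HOSTS_URL = "https://debmonitor.discovery.wmnet/hosts/"
CERT = "/etc/debmonitor/ssl/cert.pem"
KEY = "/etc/debmonitor/ssl/server.key"


def debmonitor_delete_argv(host_fqdn):
    """Build (but do not run) the checklist's DELETE command for one host.

    Hypothetical helper for illustration only.
    """
    return [
        "sudo", "curl", "-X", "DELETE",
        DEBMONITOR_HOSTS_URL + host_fqdn,
        "--cert", CERT,
        "--key", KEY,
    ]


if __name__ == "__main__":
    # Dry run: print one shell-quoted command per host; nothing is executed.
    for host in HOSTS:
        print(shlex.join(debmonitor_delete_argv(host)))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; per the checklist, the real removal is handled by wmf-decommission-host.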
Restricted Application added a subscriber: Aklapper. · Mon, Dec 3, 6:23 AM
Marostegui moved this task from Triage to In progress on the DBA board. · Mon, Dec 3, 6:23 AM
Marostegui triaged this task as High priority.
Marostegui updated the task description. · Mon, Dec 3, 6:26 AM

Change 477190 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] pc1004,1005,1006: Disable notifications

https://gerrit.wikimedia.org/r/477190

Change 477190 merged by Marostegui:
[operations/puppet@production] pc1004,1005,1006: Disable notifications

https://gerrit.wikimedia.org/r/477190

Marostegui updated the task description. · Mon, Dec 3, 6:36 AM

Change 477192 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Decommission pc1004,pc1005 and pc1006

https://gerrit.wikimedia.org/r/477192

Change 477192 merged by Marostegui:
[operations/puppet@production] mariadb: Decommission pc1004,pc1005 and pc1006

https://gerrit.wikimedia.org/r/477192

Marostegui updated the task description. · Mon, Dec 3, 6:49 AM

Mentioned in SAL (#wikimedia-operations) [2018-12-03T06:52:50Z] <marostegui> Remove pc1004, pc1005 and pc1006 from tendril and zarcillo - T210969

Mentioned in SAL (#wikimedia-operations) [2018-12-03T07:09:13Z] <marostegui> Stop MySQL on pc1004, pc1005 and pc1006 as they will be decommissioned - T210969

Marostegui updated the task description. · Mon, Dec 3, 7:11 AM
Marostegui reassigned this task from Marostegui to RobH.
Marostegui moved this task from In progress to Done on the DBA board.
Marostegui added a subscriber: Cmjohnson.

pc1004, pc1005 and pc1006 are now fully ready for DC-Ops to take over and finish their decommission.

Restricted Application added a project: Operations. · Mon, Dec 3, 7:12 AM

Priority is High, like T209858: Decommission parsercache hosts: pc2004 pc2005 pc2006 (Dec 2018 lease return), because these hosts have a hard deadline set by the lease expiration.

Marostegui added a parent task: Unknown Object (Task). · Mon, Dec 3, 7:14 AM
Marostegui mentioned this in Unknown Object (Task). · Mon, Dec 3, 7:16 AM
RobH moved this task from pending onsite steps (eqiad) to Backlog on the decommission board.

wmf-decommission-host was executed by robh for pc1004.eqiad.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for pc1005.eqiad.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for pc1006.eqiad.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor
RobH added a comment. · Mon, Dec 3, 9:34 PM

Switch ports noted for @Cmjohnson, so the port descriptions can be cleared once the hosts are unracked:

pc1004 asw-a-eqiad:ge-3/0/18
pc1005 asw2-c-eqiad:ge-7/0/17
pc1006 asw2-d-eqiad:ge-3/0/30

RobH updated the task description.
RobH updated the task description.
RobH reassigned this task from RobH to Cmjohnson.
RobH moved this task from Backlog to pending onsite steps (eqiad) on the decommission board.
RobH moved this task from Backlog to Decommission on the ops-eqiad board.