
decom californium
Closed, Resolved (Public)

Description

californium.wikimedia.org has been set to spare and marked as ready for decom in site.pp
(T168470)

It should follow the full decom template steps:


  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry (replace with role::spare::system if the system isn't shut down immediately during this process)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal) - asw2-b-eqiad:ge-4/0/38
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: mgmt dns entries removed.
  • - IF DECOM: add system to decommission tracking google sheet
  • - update netbox status to offline when unracked
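
The ordering constraint in the checklist above (steps between the START and END markers should not be left half-done) can be sketched in code. This is only an illustrative model, not part of any Wikimedia tooling: the `Step` class, the `atomic` flag, and the helper functions `can_pause` and `next_step` are hypothetical names introduced here, and the step names are abbreviated from the template.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One checklist item. `atomic` marks steps inside the non-interruptible block."""
    name: str
    atomic: bool = False
    done: bool = False


def can_pause(steps):
    """Return True if stopping now would not leave the atomic block half-finished."""
    atomic = [s for s in steps if s.atomic]
    finished = sum(s.done for s in atomic)
    return finished == 0 or finished == len(atomic)


def next_step(steps):
    """Return the first step that is not done yet, or None when everything is done."""
    for s in steps:
        if not s.done:
            return s
    return None


# Abbreviated version of the template above (hypothetical modelling, same order).
CHECKLIST = [
    Step("confirm services offline"),
    Step("disable puppet on host", atomic=True),
    Step("remove remaining puppet references", atomic=True),
    Step("power down host", atomic=True),
    Step("disable switch port", atomic=True),
    Step("remove production dns entries", atomic=True),
    Step("puppet node clean, puppet node deactivate", atomic=True),
    Step("wipe system disks"),
]
```

With this model, `can_pause(CHECKLIST)` is true before the first atomic step is ticked and again once the last one is ticked, which mirrors the intent of the START/END markers: whoever begins that block is expected to finish it in one sitting.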

Related Objects

Status    Subtype           Assigned    Task
Resolved                    Jclark-ctr
Resolved  PRODUCTION ERROR  Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    bd808
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved                    Andrew
Resolved  PRODUCTION ERROR  Andrew

Event Timeline

Dzahn triaged this task as Medium priority. (Mar 17 2018, 1:20 AM)
Dzahn created this task.

Change 420145 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] remove californium.wikimedia.org

https://gerrit.wikimedia.org/r/420145

Change 420144 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: remove mapped IPv6 from californium

https://gerrit.wikimedia.org/r/420144

Dzahn removed Andrew as the assignee of this task. (Mar 17 2018, 1:22 AM)
Dzahn updated the task description.
Dzahn edited projects, added hardware-requests; removed Patch-For-Review.
Dzahn edited subscribers, added: RobH, Cmjohnson; removed: bd808, Paladox, Krenair and 7 others.

Change 420147 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] decom californium from site and install_server

https://gerrit.wikimedia.org/r/420147

Change 420147 abandoned by Dzahn:
decom californium from site and install_server

https://gerrit.wikimedia.org/r/420147

Change 420145 abandoned by Dzahn:
remove californium.wikimedia.org

https://gerrit.wikimedia.org/r/420145

Change 420144 abandoned by Dzahn:
site: remove mapped IPv6 from californium

https://gerrit.wikimedia.org/r/420144

@Dzahn why remove all the subscribers?

Because when you click "create subtask" on a parent task, Phab copies all the subscribers over. Even though they were interested in the parent task, I'm pretty sure they don't want another two dozen notifications about the details of the decom workflow.

In T189921#4063407, Dzahn wrote:

Dzahn why remove all the subscribers?

Because when you click "create subtask" on a parent task, Phab copies all the subscribers over. Even though they were interested in the parent task, I'm pretty sure they don't want another two dozen notifications about the details of the decom workflow.

Ah, thanks for explaining it~

Change 413748 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] m5: remove grants for Californium

https://gerrit.wikimedia.org/r/413748

Change 421048 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Remove californium from smart_health_wikimedia_labs

https://gerrit.wikimedia.org/r/421048

Change 413748 merged by Andrew Bogott:
[operations/puppet@production] m5: remove grants for Californium

https://gerrit.wikimedia.org/r/413748

Change 421048 merged by Andrew Bogott:
[operations/puppet@production] Remove californium from smart_health_wikimedia_labs

https://gerrit.wikimedia.org/r/421048

Volans added a subscriber: Volans.

@RobH FYI I used this host for a test reimage and it refuses to reboot into PXE: it times out and reboots into the old OS. Given that it was already removed from Puppet, instead of fixing a to-be-decommissioned host I went ahead and just disabled Puppet and shut it down. I've updated the checkbox in the description accordingly.

Change 454237 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Remove californium

https://gerrit.wikimedia.org/r/454237

Change 454237 merged by Muehlenhoff:
[operations/puppet@production] Remove californium

https://gerrit.wikimedia.org/r/454237

wmf-decommission-host was executed by volans for californium.wikimedia.org and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Skipped downtime host on Icinga (likely already removed)
  • Skipped downtime mgmt interface on Icinga (likely already removed)
  • Removed from DebMonitor

Change 481195 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom californium production dns entries

https://gerrit.wikimedia.org/r/481195

RobH added a project: ops-eqiad.
RobH updated the task description.
RobH moved this task from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
RobH moved this task from Backlog to Decommission on the ops-eqiad board.
RobH removed a project: Patch-For-Review.
RobH updated the task description.

Mentioned in SAL (#wikimedia-cloud) [2019-06-05T22:44:40Z] <Krenair> Updating 'proxy' security group rules for port 5668 to remove decommissioned IP - 208.80.154.147 californium T189921

Cmjohnson raised the priority of this task from Medium to High.

John,

Can you wipe this server and remove it from the rack as soon as you can? We need the space.

Thanks!

Change 531295 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing mgmt dns for californium

https://gerrit.wikimedia.org/r/531295

Change 531295 merged by Cmjohnson:
[operations/dns@master] Removing mgmt dns for californium

https://gerrit.wikimedia.org/r/531295

Jclark-ctr updated the task description.
Jclark-ctr added a subscriber: Jclark-ctr.

Disk wiped, Removed host from Netbox and racks placed in storage.

Papaul updated the task description.

Complete

Change 481195 abandoned by RobH:
[operations/dns@master] decom californium production dns entries

Reason:
old neglected patchset, no longer needed.

https://gerrit.wikimedia.org/r/481195