Decommission kafka1018
Closed, Resolved · Public


In T181518 we swapped kafka1018 with kafka1023 due to an unrecoverable hw failure.

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - add role::spare in site.pp


  • - disable puppet on host - CANNOT, HOST WON'T BOOT
  • - remove all remaining puppet references (include role::spare)
  • - power down host - CANNOT, HOST WON'T BOOT
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate
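
As a rough sketch of the last step in the list above, the two puppet cleanup commands can be assembled per host before anyone runs them. The helper name `puppet_cleanup_commands` and the certname `kafka1018.example.org` are placeholders for illustration, not names taken from this task; the script only prints the commands and does not execute anything.

```python
import shlex
from typing import List


def puppet_cleanup_commands(certname: str) -> List[List[str]]:
    """Build (but do not run) the argv lists for the puppet cleanup step.

    `certname` is whatever name the node is registered under on the
    puppet master; the value used below is a placeholder assumption.
    """
    if not certname or any(ch.isspace() for ch in certname):
        raise ValueError("certname must be a non-empty name without whitespace")
    return [
        ["puppet", "node", "clean", certname],
        ["puppet", "node", "deactivate", certname],
    ]


if __name__ == "__main__":
    # Print the commands for review; running them is left to the operator.
    for argv in puppet_cleanup_commands("kafka1018.example.org"):
        print(shlex.join(argv))
```

Printing the commands first keeps the step reviewable, which matters here because the host itself cannot be booted to confirm its state.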


  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: mgmt dns entries removed.

Event Timeline

elukey created this task. Dec 15 2017, 8:24 AM
Restricted Application added a subscriber: Aklapper. Dec 15 2017, 8:24 AM
fdans moved this task from Incoming to Radar on the Analytics board. Dec 18 2017, 4:22 PM
Cmjohnson moved this task from Backlog to Decommission on the ops-eqiad board. Jan 2 2018, 4:09 PM
Ottomata triaged this task as Normal priority. Jan 16 2018, 7:32 PM
RobH added a subscriber: RobH.

All decommissioning should be tagged with #hw-requests.

RobH claimed this task. Feb 7 2018, 8:39 PM
RobH updated the task description.
RobH added a comment. Feb 7 2018, 8:44 PM

So I cannot see kafka1018 on the switch stack in row D. @Cmjohnson, I cannot actually finish the non-interrupt steps, since the port isn't noted.

The host is currently powered off because its mainboard failed, so it should not come back online. However, the port should still be traced onsite and disabled.

Change 408870 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] kafka1018 decommission

Change 408871 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] kafka1018 decom, production dns

Change 408870 merged by RobH:
[operations/puppet@production] kafka1018 decommission

Change 408871 merged by RobH:
[operations/dns@master] kafka1018 decom, production dns

RobH reassigned this task from RobH to Cmjohnson. Feb 7 2018, 8:54 PM
RobH removed a project: Patch-For-Review.
RobH updated the task description.

Ok, ready for on-site wipe and unracking (plus the tracing and disabling of the switch port)

Cmjohnson moved this task from Decommission to Up next on the ops-eqiad board. Mar 28 2018, 5:50 PM

Change 427412 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing mgmt dns from kafka1018

Change 427412 merged by Cmjohnson:
[operations/dns@master] Removing mgmt dns from kafka1018

Cmjohnson updated the task description. Apr 18 2018, 3:45 PM
Cmjohnson closed this task as Resolved.

Removed from rack and network port updated (ge-8/0/0). Updated racktables and tracking sheet.
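
For anyone recording the port in a tracking sheet, here is a minimal sketch that splits a port name such as ge-8/0/0 into its parts. It assumes the name follows Junos-style `type-fpc/pic/port` naming, which this task does not state; `SwitchPort` and `parse_port` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class SwitchPort(NamedTuple):
    media: str  # interface type prefix, e.g. "ge"
    fpc: int    # first number (assumed: FPC / stack member)
    pic: int    # second number (assumed: PIC)
    port: int   # third number (assumed: port on the PIC)


# Assumed pattern: lowercase type prefix, then three slash-separated integers.
_PORT_RE = re.compile(r"^([a-z]+)-(\d+)/(\d+)/(\d+)$")


def parse_port(name: str) -> SwitchPort:
    """Split a port name like 'ge-8/0/0' into its components."""
    match = _PORT_RE.match(name)
    if match is None:
        raise ValueError("unrecognised port name: %r" % name)
    media, fpc, pic, port = match.groups()
    return SwitchPort(media, int(fpc), int(pic), int(port))


if __name__ == "__main__":
    print(parse_port("ge-8/0/0"))
```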