
decommission wdqs100[12]
Closed, Resolved, Public

Description

Once wdqs100[45] are set up and serving traffic (T171210), wdqs100[12] can be decommissioned (see docs).

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry (replace with role::spare if the system isn't shut down immediately during this process) -> replaced with role::spare::system

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on hosts
  • - remove all remaining puppet references (including role::spare)
  • - power down hosts
  • - disable switch ports
  • - switch port assignment noted on this task (for later removal): wdqs1001:asw2-d-eqiad:ge-3/0/15 & wdqs1002:asw-c-eqiad:ge-7/0/12
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - system unracked and decommissioned (by onsite), update racktables with result
  • - switch port configuration removed from switch once system is unracked.
  • - mgmt dns entries removed.
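
The bracket shorthand used throughout this task (wdqs100[12], wdqs100[45]) stands for one hostname per character in the brackets. A minimal sketch of that expansion follows, for anyone scripting the per-host steps above. The function name expand_hosts is hypothetical and not part of any tooling mentioned here, and it assumes each bracket group holds single-character alternatives only (no ranges such as [1-3]).

```python
import itertools
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand each [chars] group into one hostname per character (hypothetical helper)."""
    # re.split with a capturing group keeps the bracket contents:
    # even indices are literal text, odd indices are the alternatives.
    parts = re.split(r"\[([^\]]+)\]", pattern)
    choices = [[p] if i % 2 == 0 else list(p) for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for host in expand_hosts("wdqs100[12]"):
        print(host)
```

A pattern without brackets is returned unchanged as a single-element list, so the same helper works for individual hostnames.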

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2017-09-13T16:28:31Z] <gehel> starting decommissioning of wdqs100[12] - T175595

Change 377797 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs - decommissioning wdqs100[12]

https://gerrit.wikimedia.org/r/377797

Was the LDF server moved off wdqs1001? If not, we should move it first.

Change 377797 merged by Gehel:
[operations/puppet@production] wdqs - decommissioning wdqs100[12]

https://gerrit.wikimedia.org/r/377797

Gehel moved this task from In progress to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Gehel added a subscriber: RobH.

@RobH I think my job is done here, let me know if you need anything else from me.

Change 377801 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/deploy@master] decomission wdqs100[12]

https://gerrit.wikimedia.org/r/377801

Change 377801 merged by Smalyshev:
[wikidata/query/deploy@master] decomission wdqs100[12]

https://gerrit.wikimedia.org/r/377801

I cannot find wdqs1001 on the network switch stack for row d (either old asw or new asw2 stack). @Cmjohnson will need to unplug the actual network cable.

Change 379282 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom wdqs100[12]

https://gerrit.wikimedia.org/r/379282

@RobH updated; ge-3/0/15 still had the asset tag listed.

Change 379282 merged by RobH:
[operations/dns@master] decom wdqs100[12]

https://gerrit.wikimedia.org/r/379282

Change 379287 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom of wdqs100[12]

https://gerrit.wikimedia.org/r/379287

Change 379287 merged by RobH:
[operations/puppet@production] decom of wdqs100[12]

https://gerrit.wikimedia.org/r/379287

Assigned to @Cmjohnson for onsite followup.

RobH edited projects, added ops-eqiad; removed Patch-For-Review.
RobH moved this task from Backlog to Decommission on the ops-eqiad board.
RobH triaged this task as Low priority. Sep 20 2017, 6:15 PM
RobH updated the task description.

Removing this from the WDQS board, nothing more to do on our side...

Change 425333 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing mgmt dns for wdqs1001/2

https://gerrit.wikimedia.org/r/425333

Change 425333 merged by Cmjohnson:
[operations/dns@master] Removing mgmt dns for wdqs1001/2

https://gerrit.wikimedia.org/r/425333

Cmjohnson updated the task description.

Removed from rack, updated tracking sheet