
Repurpose labtestpuppetmaster2001.wikimedia.org as cloudcephmon2003-dev.codfw.wmnet
Closed, ResolvedPublic

Description

New VMs will be using an in-cloud puppetmaster, so this hardware is no longer needed for puppet mastering.

The host will be moved to an internal IP in the cloud-hosts1-b-codfw VLAN.

Event Timeline

Andrew created this task. Jul 15 2020, 7:59 PM
Restricted Application added a subscriber: Aklapper. Jul 15 2020, 7:59 PM

Change 612940 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Change labtestpuppetmaster2001 to role(spare::system)

https://gerrit.wikimedia.org/r/612940

Change 612940 merged by Andrew Bogott:
[operations/puppet@production] Change labtestpuppetmaster2001 to role(spare::system)

https://gerrit.wikimedia.org/r/612940

Andrew triaged this task as Medium priority. Jul 15 2020, 8:16 PM
Andrew added a comment. Edited Oct 23 2020, 2:38 PM

Decided: this will become cloudcephmon200x-dev (T266257).

Andrew renamed this task from "Decide fate of labtestpuppetmaster2001.wikimedia.org" to "Repurpose labtestpuppetmaster2001.wikimedia.org as cloudcephmon2003-dev.codfw.wmnet". Nov 16 2020, 6:37 PM

Change 642418 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Rename labtestpuppetmaster2001.wikimedia.org to cloudcephmon2003-dev.codfw.wmnet

https://gerrit.wikimedia.org/r/642418

Change 642419 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Remove stale references to labtestpuppetmaster2001.wikimedia.org

https://gerrit.wikimedia.org/r/642419

Change 642418 merged by Andrew Bogott:
[operations/puppet@production] Rename labtestpuppetmaster2001.wikimedia.org to cloudcephmon2003-dev.codfw.wmnet

https://gerrit.wikimedia.org/r/642418

cookbooks.sre.hosts.decommission executed by andrew@cumin1001 for hosts: labtestpuppetmaster2001.wikimedia.org

  • labtestpuppetmaster2001.wikimedia.org (FAIL)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Failed to power off, manual intervention required: Remote IPMI for labtestpuppetmaster2001.mgmt.codfw.wmnet failed (exit=1): b''
    • Host steps raised exception: The request failed with code 500 Internal Server Error but more specific details were not returned in json. Check the NetBox Logs or investigate this exception's error attribute.

ERROR: some step on some host failed, check the bolded items above

Andrew updated the task description. Nov 20 2020, 4:26 PM

cookbooks.sre.hosts.decommission executed by volans@cumin1001 for hosts: labtestpuppetmaster2001.wikimedia.org

  • labtestpuppetmaster2001.wikimedia.org (FAIL)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Failed to power off, manual intervention required: Remote IPMI for labtestpuppetmaster2001.mgmt.codfw.wmnet failed (exit=1): b''
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • COMMON_STEPS (WARN)
    • Not all affected DC(s) have been migrated to automatic DNS, a manual patch to the operations/dns repository is required

ERROR: some step on some host failed, check the bolded items above

Andrew reassigned this task from Andrew to Papaul. Nov 20 2020, 5:07 PM
Andrew added a subscriber: Papaul.

This needs to be moved to row B before it can connect to cloud-hosts1-b-codfw. Leaving that in @Papaul's hands for now.

Change 642472 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Remove references to labtestpuppetmaster2001

https://gerrit.wikimedia.org/r/642472

Change 642472 merged by Andrew Bogott:
[operations/dns@master] Remove references to labtestpuppetmaster2001

https://gerrit.wikimedia.org/r/642472

@Andrew, can you please power down this server before I get on site? If there is nothing running on it, I can do it.
Thanks.

It's idle, you can power it down whenever.

switch information ge-8/0/17

[edit interfaces interface-range disabled]
-    member ge-8/0/17;
[edit interfaces interface-range vlan-cloud-hosts1-b-codfw]
     member ge-8/0/5 { ... }
+    member ge-8/0/17;
[edit interfaces]
+   ge-8/0/17 {
+       description "cloudcephmon2003-dev:##PRIMARY## {#}";
+   }
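
For readers more used to Junos `set`-style configuration, the diff above can be restated as three commands. The following is only an illustrative sketch: the helper name `junos_set_commands` is hypothetical, and the task does not say how the change was actually entered on the switch.

```python
def junos_set_commands(port, vlan_range, description, old_range="disabled"):
    """Render set-style commands equivalent to the interface diff.

    Hypothetical helper for illustration only; it just builds strings
    and does not talk to any device.
    """
    return [
        # Take the port out of the interface-range it was parked in.
        f"delete interfaces interface-range {old_range} member {port}",
        # Add the port to the interface-range for the target VLAN.
        f"set interfaces interface-range {vlan_range} member {port}",
        # Label the port with the new hostname.
        f'set interfaces {port} description "{description}"',
    ]


if __name__ == "__main__":
    for command in junos_set_commands(
        "ge-8/0/17",
        "vlan-cloud-hosts1-b-codfw",
        "cloudcephmon2003-dev:##PRIMARY## {#}",
    ):
        print(command)
```
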

Papaul reassigned this task from Papaul to Andrew. Nov 23 2020, 4:08 PM

Done on my end

papaul@asw-b-codfw> show interfaces ge-8/0/17 descriptions
Interface       Admin Link Description
ge-8/0/17       up    up   cloudcephmon2003-dev:##PRIMARY## {#}
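
A check like the one above could also be scripted. The sketch below assumes the command output has already been captured as text. The function names `parse_interface_descriptions` and `port_is_ready` are made up for this example and are not part of any Wikimedia tooling.

```python
SAMPLE_OUTPUT = """\
papaul@asw-b-codfw> show interfaces ge-8/0/17 descriptions
Interface       Admin Link Description
ge-8/0/17       up    up   cloudcephmon2003-dev:##PRIMARY## {#}
"""


def parse_interface_descriptions(output):
    """Map interface name -> (admin, link, description).

    Lines whose second and third columns are not up/down (the prompt
    line and the header line) are skipped.
    """
    rows = {}
    for line in output.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        name, admin, link = parts[0], parts[1], parts[2]
        if admin not in ("up", "down") or link not in ("up", "down"):
            continue
        description = parts[3].strip() if len(parts) == 4 else ""
        rows[name] = (admin, link, description)
    return rows


def port_is_ready(output, port, expected_host):
    """True if the port is admin up, link up, and labelled for the host."""
    row = parse_interface_descriptions(output).get(port)
    if row is None:
        return False
    admin, link, description = row
    return (
        admin == "up"
        and link == "up"
        and description.startswith(expected_host + ":")
    )


if __name__ == "__main__":
    print(port_is_ready(SAMPLE_OUTPUT, "ge-8/0/17", "cloudcephmon2003-dev"))
```
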

Change 642419 merged by Andrew Bogott:
[operations/puppet@production] Remove stale references to labtestpuppetmaster2001.wikimedia.org

https://gerrit.wikimedia.org/r/642419

Change 643315 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Separate partman recipe for cloudcephmon2003-dev

https://gerrit.wikimedia.org/r/643315

Change 643315 merged by Andrew Bogott:
[operations/puppet@production] Separate partman recipe for cloudcephmon2003-dev

https://gerrit.wikimedia.org/r/643315

Change 643322 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Further partman attempts for cloudcephmon2003-dev

https://gerrit.wikimedia.org/r/643322

Change 643322 merged by Andrew Bogott:
[operations/puppet@production] Further partman attempts for cloudcephmon2003-dev

https://gerrit.wikimedia.org/r/643322

Mentioned in SAL (#wikimedia-operations) [2020-11-26T14:23:08Z] <moritzm> remove labtestpuppetmaster2001 from debmonitor T258103

Andrew closed this task as Resolved. Dec 4 2020, 2:44 AM