
labtestneutron2001: reimage to stretch & rename to cloudnet2001-dev
Closed, Resolved (Public)

Description

Use this server to try the upgrade from jessie to stretch.

Since trying the new puppet code involves reimaging, let's rename the host to the modern naming scheme while we're at it.

Timeline would be:

  • disable puppet in labtestneutron2001
  • merge puppet patch to rename and get the new debian installer working
  • merge dns patch to add the new FQDNs (partial; the old mgmt names still remain)
  • run the wmf-auto-reimage-host script
  • merge DNS cleanup patch
  • netbox update
  • get the physical relabeling done (T214181)
  • done
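
The timeline merges the DNS patch before the reimage, presumably so that the new name already resolves when the reimage script runs. A minimal Python sketch of a pre-flight check that could sit between those two steps follows. The function name `fqdn_resolves` and its injectable `resolver` parameter are illustrative assumptions, not part of any existing tooling; only the hostname comes from this task.

```python
import socket
from typing import Callable


def fqdn_resolves(fqdn: str, resolver: Callable[..., list] = socket.getaddrinfo) -> bool:
    """Return True if `fqdn` resolves to at least one address.

    `resolver` is injectable (hypothetical design choice) so the check
    can be exercised without touching real DNS.
    """
    try:
        return len(resolver(fqdn, None)) > 0
    except socket.gaierror:
        # Name lookup failed: the record is not visible from this machine.
        return False


if __name__ == "__main__":
    new_name = "cloudnet2001-dev.codfw.wmnet"
    if fqdn_resolves(new_name):
        print(f"{new_name} resolves; the DNS patch appears to be live")
    else:
        print(f"{new_name} does not resolve yet; hold the reimage")
```

A failed lookup only says the name is not visible from where the check runs, so it is a hint rather than proof that the patch has not been deployed.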

By the way, this server is the standby in the neutron setup of the labtestn deployment.

Event Timeline

aborrero triaged this task as Medium priority. (Jan 18 2019, 1:50 PM)
aborrero created this task.

Change 485185 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] labtestneutron2001: reimage to stretch and rename to cloudnet2001-dev

https://gerrit.wikimedia.org/r/485185

Change 485187 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] labtestneutron2001: rename to cloudnet2001-dev

https://gerrit.wikimedia.org/r/485187

Change 485185 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] labtestneutron2001: reimage to stretch and rename to cloudnet2001-dev

https://gerrit.wikimedia.org/r/485185

Change 485187 merged by Arturo Borrero Gonzalez:
[operations/dns@master] labtestneutron2001: rename to cloudnet2001-dev

https://gerrit.wikimedia.org/r/485187

Script wmf-auto-reimage was launched by aborrero on cumin1001.eqiad.wmnet for hosts:

labtestneutron2001.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/201901181613_aborrero_249597_labtestneutron2001_codfw_wmnet.log.
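
The log file name appears to encode a start timestamp, the user, a number, and the host with dots replaced by underscores. A minimal Python sketch for pulling those fields apart follows. The layout is inferred from this single example and is an assumption, as is the meaning of the numeric field (possibly a process ID); `parse_reimage_log` is a hypothetical helper name.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Assumed layout: <YYYYmmddHHMM>_<user>_<number>_<host with "_" for ".">.log
LOG_NAME = re.compile(
    r"^(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<num>\d+)_(?P<host>.+)\.log$"
)


def parse_reimage_log(path: str) -> dict:
    """Split a reimage log path into its (assumed) components."""
    name = PurePosixPath(path).name
    m = LOG_NAME.match(name)
    if m is None:
        raise ValueError(f"unexpected log file name: {name}")
    return {
        "started": datetime.strptime(m["stamp"], "%Y%m%d%H%M"),
        "user": m["user"],
        "number": int(m["num"]),  # meaning not confirmed by the task
        # Hostnames cannot contain underscores, so this mapping is unambiguous.
        "host": m["host"].replace("_", "."),
    }


if __name__ == "__main__":
    info = parse_reimage_log(
        "/var/log/wmf-auto-reimage/"
        "201901181613_aborrero_249597_labtestneutron2001_codfw_wmnet.log"
    )
    print(info)
```

The parsed start time (16:13 on 2019-01-18) is consistent with the SAL entry logged at 16:14:50Z.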

Mentioned in SAL (#wikimedia-operations) [2019-01-18T16:14:50Z] <arturo> T214167 reimage+rename labtestneutron2001.codfw.wmnet (jessie) to cloudnet2001-dev.codfw.wmnet (stretch)

Completed auto-reimage of hosts:

['cloudnet2001-dev.codfw.wmnet']

Of which those FAILED:

['cloudnet2001-dev.codfw.wmnet']

The host was successfully reimaged+renamed, despite the ops-monitoring-bot message.

Change 485613 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] labtestneutron2001: cleanup

https://gerrit.wikimedia.org/r/485613

Change 485613 merged by Arturo Borrero Gonzalez:
[operations/dns@master] labtestneutron2001: cleanup

https://gerrit.wikimedia.org/r/485613

aborrero updated the task description.

cloudnet2001-dev - Check systemd state - CRITICAL - degraded: The system is operational but one or more units failed.

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=cloudnet2001-dev&service=Check+systemd+state
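
The link above follows a regular pattern: a fixed `extinfo.cgi` endpoint plus `type`, `host`, and `service` query parameters, with spaces in the service name encoded as `+`. A minimal Python sketch that rebuilds such a link for any host and check follows. `service_url` is a hypothetical helper name, and `type=2` is simply copied from the link above.

```python
from urllib.parse import urlencode

ICINGA_EXTINFO = "https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi"


def service_url(host: str, service: str) -> str:
    """Build a service detail link in the same shape as the one in the alert."""
    # urlencode uses quote_plus by default, so spaces become "+".
    query = urlencode({"type": 2, "host": host, "service": service})
    return f"{ICINGA_EXTINFO}?{query}"


if __name__ == "__main__":
    print(service_url("cloudnet2001-dev", "Check systemd state"))
```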

Thanks! It should be solved now.

Currently we have the following alerts:

cloudcontrol2001-dev - systemd state
cloudnet2002-dev - systemd state
cloudnet2003-dev - Check whether microcode mitigations for CPU vulnerabilities are applied, DPKG state
cloudvirt2001-dev - DPKG
cloudvirt2002-dev - DPKG
cloudvirt2003-dev - systemd state
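
To see which checks are failing across the fleet rather than per host, the list above can be regrouped by check. A minimal Python sketch follows, assuming each line has the form `host - check[, check...]`. The helper name `hosts_by_check` is hypothetical, and the parser tolerates a missing space after the dash.

```python
import re
from collections import defaultdict

ALERTS = """\
cloudcontrol2001-dev - systemd state
cloudnet2002-dev - systemd state
cloudnet2003-dev - Check whether microcode mitigations for CPU vulnerabilities are applied, DPKG state
cloudvirt2001-dev - DPKG
cloudvirt2002-dev - DPKG
cloudvirt2003-dev - systemd state
"""

# Host (no whitespace), whitespace, a dash, optional whitespace, then the checks.
LINE = re.compile(r"^(\S+)\s+-\s*(.+)$")


def hosts_by_check(text: str) -> dict:
    """Map each check name to the hosts reporting it, in input order."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip blank or unrecognised lines
        host, checks = m.groups()
        for check in checks.split(","):
            grouped[check.strip()].append(host)
    return dict(grouped)


if __name__ == "__main__":
    for check, hosts in hosts_by_check(ALERTS).items():
        print(f"{check}: {', '.join(hosts)}")
```

Check names are kept exactly as written, so "DPKG" and "DPKG state" remain separate groups; whether they refer to the same check is not stated in the task.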