relocate/reimage cloudvirt1025 with 10G interfaces
Closed, Resolved (Public)

Description

This host is now drained and ready for re-networking, rebuild, etc.

cloudvirt1025:

  • - put system offline in all checks for maint window
  • - relocate to 10G rack and update netbox
  • - update switch configuration for new primary 10G
  • - enable PXE for 10G primary interface.
  • - attach/cable secondary 10G port for instance traffic, update switch config.
  • - remove old switch config for 1G ports
  • - (update firmware?)
  • - PXE boot and reimage system
  • - reintroduce system into service cluster

Event Timeline

RobH added a parent task: Unknown Object (Task). Oct 28 2020, 8:14 PM

I'm unable to PXE boot this host. It doesn't display much of anything; it just hangs for a while and then fails over to the HDD.

The same behavior occurs on cloudvirt1026; I haven't tested 27, 28, 29, or 30 yet.

So the checklist wasn't updated for this task, but I'm told it's been moved, with its ports set up and puppet updated. However, it is failing to boot, and I'm double-checking all settings and steps.

So I'll be checking off steps that were previously done by onsites.

Change 644881 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] updating cloudvirt1025 mac address

https://gerrit.wikimedia.org/r/644881

Change 644881 merged by RobH:
[operations/puppet@production] updating cloudvirt1025 mac address

https://gerrit.wikimedia.org/r/644881

The MAC address was wrong in puppet; that is now fixed, and I'm handing off to Andrew for reimage via script. I was able to PXE boot, but then powered the host off. The host is currently powered down and needs reimage and redeployment.
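
A MAC mismatch like the one above can be checked before attempting PXE by comparing the address recorded in configuration against the one the NIC reports. The following is only a minimal sketch of such a check, assuming both addresses are available as strings. The function names and the example addresses are hypothetical and are not taken from the puppet change.

```python
import re


def normalize_mac(mac):
    """Return a 48-bit MAC as lowercase, colon-separated hex pairs.

    Accepts the common separator styles (aa:bb:..., AA-BB-..., aabb.ccdd.eeff).
    """
    # Strip only the usual separators, then validate what remains.
    digits = re.sub(r"[:.\-]", "", mac.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def macs_match(configured, observed):
    """True if two MAC strings name the same address, ignoring formatting."""
    return normalize_mac(configured) == normalize_mac(observed)


if __name__ == "__main__":
    # Hypothetical values: one as it might appear in a config file,
    # one as it might be read off the NIC or the switch.
    configured = "AA:BB:CC:DD:EE:01"
    observed = "aabb.ccdd.ee02"
    if not macs_match(configured, observed):
        print("MAC mismatch:", normalize_mac(configured), "vs", normalize_mac(observed))
```

Normalizing both sides first avoids false mismatches caused purely by case or separator differences between tools.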

RobH claimed this task.

I'm closing this onsite task and creating a reimage task.