
Cookbook sre.hosts.reimage: DHCP snippet created with old IP when --move-vlan is used
Closed, Resolved, Public

Description

I'm not 100% sure of the reason for this, but when I reimaged a few hosts in eqiad rows C/D today using the --move-vlan argument to the reimage cookbook, the process did not complete correctly.

While investigating, I could see the DHCP packets arriving at install1005, but the server was not responding to the host with a DHCP OFFER. It seems that although the cookbook successfully updated the host's IP in Netbox, wiped our DNS caches and updated our authoritative DNS, the DHCP snippet file was created with the host's old IP. So even though the Option 97 info matched, dhcpd did not return an IP, because the giaddr in the DISCOVER was on a different subnet (the new vlan) than the IP configured in the snippet.
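To make the mismatch concrete, here is a minimal sketch of the check that effectively fails in this situation: is the address in the snippet on the same subnet as the relay address (giaddr) seen in the DISCOVER? The function name, the addresses (documentation ranges) and the /24 prefix length are all assumptions for illustration. This is not how dhcpd is implemented, only the idea.

```python
import ipaddress


def snippet_matches_relay(fixed_address: str, giaddr: str, prefix_len: int) -> bool:
    """Return True if the snippet's address sits on the relay's subnet.

    Hypothetical helper: derives the relay's subnet from giaddr and an
    assumed prefix length, then tests whether fixed_address belongs to it.
    """
    relay_subnet = ipaddress.ip_interface(f"{giaddr}/{prefix_len}").network
    return ipaddress.ip_address(fixed_address) in relay_subnet


# Illustrative values only (RFC 5737 documentation ranges, not real hosts).
stale_ip = "192.0.2.10"       # old-vlan address left in the snippet
fresh_ip = "198.51.100.10"    # address the host has after the vlan move
new_giaddr = "198.51.100.1"   # relay address on the new vlan

print(snippet_matches_relay(stale_ip, new_giaddr, 24))  # False
print(snippet_matches_relay(fresh_ip, new_giaddr, 24))  # True
```

With the stale address the check is False, which lines up with the server staying silent. With the post-move address it is True.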

This is probably just a race condition that we need to work out, so that the IP is re-discovered when --move-vlan is used. For now, aborting the reimage and running it again causes a correct config snippet to be added, and things should then proceed.
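A rough sketch of the "re-discover the IP" idea, under stated assumptions: `ip_for_dhcp_snippet` and the `lookup_ip` callable are hypothetical names, and this is not the cookbook's actual code or the change that was eventually merged. The point is only that when the vlan has moved, an address resolved earlier in the run should not be reused for the snippet.

```python
from typing import Callable


def ip_for_dhcp_snippet(
    cached_ip: str, move_vlan: bool, lookup_ip: Callable[[], str]
) -> str:
    """Pick the address to write into the DHCP snippet.

    Hypothetical helper: if the host was moved to a new vlan, ignore the
    address resolved earlier in the run and look it up again from the
    source of truth; otherwise the earlier value is still valid.
    """
    if move_vlan:
        return lookup_ip()
    return cached_ip


# Stand-in for a fresh lookup; the address is illustrative only.
def fresh_lookup() -> str:
    return "198.51.100.10"


print(ip_for_dhcp_snippet("192.0.2.10", True, fresh_lookup))   # 198.51.100.10
print(ip_for_dhcp_snippet("192.0.2.10", False, fresh_lookup))  # 192.0.2.10
```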

Event Timeline

cmooney triaged this task as Medium priority.

Change #1237151 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/cookbooks@master] reimage: use the freshest IP fpr DHCP

https://gerrit.wikimedia.org/r/1237151

Change #1237151 merged by jenkins-bot:

[operations/cookbooks@master] reimage: use the freshest IP for DHCP

https://gerrit.wikimedia.org/r/1237151

ayounsi claimed this task.

fixed.