reimage/pxe boot failing on cloudvirt1028
Closed, ResolvedPublic

Description

I'm trying to reimage cloudvirt1028 with a slightly different puppet role and can't get it to pxe boot.

I've tried the sre.hosts.reimage cookbook but have also tried pxe booting via the bios setup and either way I just get a long hang and then a fall back to hdd boot.

Due to failed reimage this host is probably in a fairly messed-up state in netbox and puppetdb.

Event Timeline

I need to stop working on this for the evening but it might be of interest to @Volans and/or @RobH

Volans added a subscriber: Cmjohnson.

I think this might be the same as T296856, and that the host needs a firmware upgrade. Adding DCOps.

btw this host is currently out of service and invisible to icinga so it can be rebooted whenever.

updated the firmware of the bios, both network cards, raid controller, and backplane. system booted back into the existing os.

Volans suggests that this might be related to cloudgw changes. @aborrero, @cmooney, @ayounsi would you expect dhcp to still work normally on cloudvirts?

The host keeps failing, and from the logs on the install server pxelinux is never offered. Looking at the DHCP logs, I found that they do not follow the usual offer workflow:

Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPDISCOVER from bc:97:e1:a7:3b:d8 via 10.64.20.3
Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPOFFER on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.3
Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPDISCOVER from bc:97:e1:a7:3b:d8 via 10.64.20.2
Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPOFFER on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.2
[..SNIP...]
Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPREQUEST for 10.64.20.52 (208.80.154.32) from bc:97:e1:a7:3b:d8 via 10.64.20.3
Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPACK on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.3
Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPREQUEST for 10.64.20.52 (208.80.154.32) from bc:97:e1:a7:3b:d8 via 10.64.20.2: unknown lease 10.64.20.52.

Note that the host is behind one of the cloudsw switches. @ayounsi, @cmooney do you think this could be the culprit, and that option 82 might have some issues with the configuration of those switches?

The other option is to retry another reimage while keeping a tcpdump on the install server to see the DHCP packets in their entirety.

> Note that the host is behind one of the cloudsw switches. @ayounsi, @cmooney do you think this could be the culprit, and that option 82 might have some issues with the configuration of those switches?

The cloudsw is configured to insert the option 82 data into DHCP requests, the same as the other ASWs. The requests in the above logs have been forwarded by the cr1 and cr2 core routers, both of which receive a copy of the original DHCP broadcast frame. The install servers are correctly configured to match the option 82 data that should be getting added.

> The other option is to retry another reimage while keeping a tcpdump on the install server to see the DHCP packets in their entirety.

I think that would be a wise next step. I'll chat to Andrew to see if we can do that; it might shed more light on what's happening.

Looks like the DHCP server is returning what it should.

But the host keeps sending further DHCP requests, so not sure if the replies are making it. PCAP file attached.

@RobH @wiki_willy sounds like this is back in your court. Cathal and I are out of ideas :(

Ok the traffic is being relayed by cr1-eqiad at least:

cmooney@re0.cr1-eqiad> monitor traffic interface xe-3/0/4.1118 matching "port 67 or port 68" no-resolve 
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is OFF.
Listening on xe-3/0/4.1118, capture size 96 bytes

17:39:20.179348  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 668
17:39:20.180766 Out IP truncated-ip - 424 bytes missing! 10.64.20.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 446
17:39:24.303601  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 601
17:39:28.539174  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 668
17:39:28.540573 Out IP truncated-ip - 424 bytes missing! 10.64.20.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 446
17:39:32.646477  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 601
17:39:41.191672  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 668
17:39:41.193099 Out IP truncated-ip - 424 bytes missing! 10.64.20.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 446
17:39:45.512819  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 601

cr2-eqiad gets the requests and is forwarding them, as can be seen below, but I don't see the replies:

cmooney@re0.cr2-eqiad> monitor traffic interface xe-3/0/4.1118 matching "port 67 or port 68" no-resolve 
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is OFF.
Listening on xe-3/0/4.1118, capture size 96 bytes

17:39:20.164743  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 614
17:39:24.282603  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 613
17:39:28.521976  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 614
17:39:32.626413  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 613
17:39:41.076501  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 614
17:39:45.264026  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 613
17:40:01.357401  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 614
17:40:05.481057  In IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from bc:97:e1:a7:3b:d8, length 613

Given the fact the install server is getting requests from both CRs, and sending replies to both, I'm not sure why cr2-eqiad does not seem to be relaying the replies. Of course if cr1-eqiad is doing so the overall process should still work.

Not sure if this has any bearing on the above, but trying to ping each CR from install1003 I note that cr2-eqiad is blocking the ping:

cmooney@install1003:~$ ping 10.64.20.2
PING 10.64.20.2 (10.64.20.2) 56(84) bytes of data.
64 bytes from 10.64.20.2: icmp_seq=1 ttl=64 time=0.424 ms
64 bytes from 10.64.20.2: icmp_seq=2 ttl=64 time=0.454 ms
cmooney@install1003:~$ ping 10.64.20.3
PING 10.64.20.3 (10.64.20.3) 56(84) bytes of data.
From 10.64.20.3 icmp_seq=1 Packet filtered
From 10.64.20.3 icmp_seq=2 Packet filtered

labe-v4 seems to be the only relevant filter, and I can't see a difference on either that would affect it.

@Dzahn points out that it could be that dhcp is working but preseed is failing

I should have realised this on Friday, but I think we can be sure the DHCP responses from install1003 are making it to the host.

Initially the host sends a DHCP DISCOVER message to obtain an IP, and install1003 responds:

Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPDISCOVER from bc:97:e1:a7:3b:d8 via 10.64.20.2
Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPOFFER on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.2

For whatever reason, a few seconds later, the host sends another DHCP packet, but this is a DHCP REQUEST, specifically asking to renew the IP it was allocated in the first exchange:

Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPREQUEST for 10.64.20.52 (208.80.154.32) from bc:97:e1:a7:3b:d8 via 10.64.20.3
Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPACK on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.3

The fact this second packet from the host contains the assigned IP shows that the initial DHCP response from install1003 was received. Why the host does not properly go into PXE boot, or why it tries to renew the DHCP lease so soon, I am not sure. But I think this shows the DHCP responses from install1003 are getting back.

Check whether the server can talk HTTP to apt1001.wikimedia.org / apt2001.wikimedia.org.

After getting an IP from DHCP but before starting the Debian installer it needs to fetch:

 preseed/url=http://apt.wikimedia.org/autoinstall/preseed.cfg
..
auto-install/enable=true preseed/url=http://apt.wikimedia.org/autoinstall/preseed.cfg

then

preseed/early_command	string	wget -O /tmp/early_command http://apt.wikimedia.org/autoinstall/scripts/early_command.sh

etc ... and eventually the actual installer

option pxelinux.pathprefix "http://apt.wikimedia.org/tftpboot/bullseye-installer/";

edit: actually just apt1001 matters currently:

apt.wikimedia.org has address 208.80.154.30

30.154.80.208.in-addr.arpa domain name pointer apt1001.wikimedia.org.

also might want to check logs on apt1001, it should be nginx there, around this:

[apt1001:/var/log/nginx] $ grep preseed *.log
access.log:10.192.32.142 - - [06/Dec/2021:10:05:00 +0000] "GET /autoinstall/preseed.cfg HTTP/1.1" 200 18815 "-" "debian-installer"

would do a "tail -f" on /var/log/nginx/*.log on apt1001 (in addition to one for DHCP on install1003) while trying another install

then see how far it gets.

maybe check apt2001 as well for good measure, even though I don't think it would use that for codfw
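
As a rough sketch of the log check suggested above, the snippet below filters nginx access-log lines by client IP and lists which installer paths were fetched. It assumes the default combined log format seen in the pasted line; `ACCESS_RE`, `requests_from` and `SAMPLE` are hypothetical names used only for illustration, and the two sample lines are copied from this task.

```python
import re

# Assumption: nginx "combined" access log format, as in the lines pasted in this task.
ACCESS_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) ')


def requests_from(lines, client_ip):
    """Return (timestamp, path, status) for every request made by client_ip."""
    found = []
    for line in lines:
        match = ACCESS_RE.match(line)
        # Exact string comparison, so "10.64.20.5" does not match "10.64.20.52".
        if match and match.group(1) == client_ip:
            found.append((match.group(2), match.group(4), int(match.group(5))))
    return found


SAMPLE = [
    '10.192.32.142 - - [06/Dec/2021:10:05:00 +0000] "GET /autoinstall/preseed.cfg HTTP/1.1" 200 18815 "-" "debian-installer"',
    '10.64.20.52 - - [06/Dec/2021:11:06:07 +0000] "GET /tftpboot/buster-installer/ HTTP/1.1" 200 803 "-" "curl/7.64.0"',
]

for timestamp, path, status in requests_from(SAMPLE, "10.64.20.52"):
    print(timestamp, path, status)
```

If a host never shows up with a request for preseed.cfg, the install is probably failing before the installer starts, i.e. at the DHCP/PXE stage rather than at preseed.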

@Dzahn Thanks for the pointers. There is one log entry which seems to be from the affected host, requesting the URL that is being returned in DHCP:

10.64.20.52 - - [06/Dec/2021:11:06:07 +0000] "GET /tftpboot/buster-installer/ HTTP/1.1" 200 803 "-" "curl/7.64.0"

Interestingly that's from earlier today, not sure if someone tried to re-image the host earlier. I don't see any equivalent logs for Friday evening when we attempted it multiple times.

I think the best next step is probably to try the reimage again and have someone from Infra Foundations monitor the console to see if there is any info there that might be relevant. I'll see if we can proceed to do that.

OK, despite what I said earlier, I can confirm that DHCP is failing. It isn't visible on the serial console, but via the virtual monitor port you can see it time out at the DHCP stage (I took a video of this but couldn't find a good way to link it here):

image.png (571×954 px, 222 KB)

Ok so on the back of that I wanted to see what was actually going on host side when it made a DHCP request. So when it booted back up I logged on via console and removed the hard-coded IPv4 address from eno1np0 manually using 'ip', and then manually initiated dhclient to obtain an IP (after making sure install1003 was in the right state to return an IP). When I did this I could see the DHCP requests working, but for some reason the server went into an unending loop, sending multiple DHCPREQUESTs after it received the original offer:

root@cloudvirt1028:~# dhclient -v eno1np0
Internet Systems Consortium DHCP Client 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eno1np0/bc:97:e1:a7:3b:d8
Sending on   LPF/eno1np0/bc:97:e1:a7:3b:d8
Sending on   Socket/fallback
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 8
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 6
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 6
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 3
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 8
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 5
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 3
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 5
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 6
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 7
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 5
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 7
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 4
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 5
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 3
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 8
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 3
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 5
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 6
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 4
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 4
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 4
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 7
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
^C
root@cloudvirt1028:~#

I eventually CTRL-C'd out of that, and sure enough the interface had not configured itself with an IP:

root@cloudvirt1028:~# ip -br addr show eno1np0
eno1np0          UP             2620:0:861:118:10:64:20:52/64 fe80::10:64:20:52/64 fe80::be97:e1ff:fea7:3bd8/64
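
A minimal sketch of how a long `dhclient -v` transcript like the one above could be summarised, to make the missing ACK obvious at a glance. The function name `summarise_dhclient` and the `SAMPLE` constant are hypothetical; the sample lines are taken from the output above. It assumes dhclient prints a line starting with "bound to" once a lease is actually accepted.

```python
import re
from collections import Counter


def summarise_dhclient(output):
    """Count DHCP message types in `dhclient -v` output and report whether a lease was bound."""
    counts = Counter(re.findall(r"^(DHCP[A-Z]+)\b", output, flags=re.MULTILINE))
    # Assumption: a successful run ends with a "bound to <ip> ..." line.
    bound = bool(re.search(r"^bound to ", output, flags=re.MULTILINE))
    return counts, bound


SAMPLE = """\
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 8
DHCPOFFER of 10.64.20.52 from 10.64.20.2
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPREQUEST for 10.64.20.52 on eno1np0 to 255.255.255.255 port 67
DHCPDISCOVER on eno1np0 to 255.255.255.255 port 67 interval 6
"""

counts, bound = summarise_dhclient(SAMPLE)
print(dict(counts), "bound" if bound else "never bound")
```

Offers with repeated requests, no DHCPACK and no "bound" line is the pattern seen here: the client accepts the offer but never hears the server's confirmation.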

Looking at the packet captures on either side (install1003 and cloudvirt1028), the packets are the same. I realise, however, that the PXElinux prefix is missing. So clearly what I did on install1003 to replicate the DHCP server config during the reimage process was insufficient. I will discuss with @Volans tomorrow how best to replicate that and try this again.

> Check whether the server can talk HTTP to apt1001.wikimedia.org / apt2001.wikimedia.org.
>
> After getting an IP from DHCP but before starting the Debian installer it needs to fetch:

The host was not even getting pxelinux.0 served by the install server (missing log lines in syslog on the install server). So it's failing way before that.
For the record the host can perfectly reach apt.wikimedia.org from the current OS.

> Looking at the packet captures on either side (install1003 and cloudvirt1028), the packets are the same. I realise, however, that the PXElinux prefix is missing. So clearly what I did on install1003 to replicate the DHCP server config during the reimage process was insufficient. I will discuss with @Volans tomorrow how best to replicate that and try this again.

For this kind of debugging you can just use the cookbook:

sre.hosts.dhcp: Set the ephemeral DHCP for a given host, then give control to the user and clear the DHCP config on exit.

Mentioned in SAL (#wikimedia-operations) [2021-12-07T11:31:47Z] <topranks> removing IP addressing on cloudvirt1028 manually and forcing DHCP to debug reimage failure (T296906)

Change 744854 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Allow cloud-hosts1-eqiad DHCP responses to eqiad CRs

https://gerrit.wikimedia.org/r/744854

@Volans many thanks for the info, that is super handy :)

Using that cookbook I got the same results as my previous attempt. I needed a refresh on the normal DHCP message flow, after which I was able to see where the problem was.

DHCP messages

First of all consider the normal DHCP message exchange:

https://upload.wikimedia.org/wikipedia/commons/e/e4/DHCP_session.svg

The full exchange doesn't complete until the client indicates it will accept the server's DHCP OFFER, by sending a DHCP REQUEST back to the server. After this the server responds to the client with a DHCP ACK, completing the process.
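
To make the four-step exchange concrete, here is a deliberately simplified sketch of the client side as a state machine. The state names follow RFC 2131, but the event labels ("send DISCOVER", "recv OFFER", and so on), the `TRANSITIONS` table and the `run` helper are hypothetical and only for illustration; a real client also handles retransmits, NAKs and renewals.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"
    SELECTING = "SELECTING"    # DHCPDISCOVER sent, waiting for offers
    REQUESTING = "REQUESTING"  # DHCPREQUEST sent, waiting for the ACK
    BOUND = "BOUND"            # DHCPACK received, address can be configured


# (current state, event) -> next state; anything else leaves the state unchanged.
TRANSITIONS = {
    (State.INIT, "send DISCOVER"): State.SELECTING,
    (State.SELECTING, "recv OFFER"): State.REQUESTING,
    (State.REQUESTING, "recv ACK"): State.BOUND,
    (State.REQUESTING, "timeout"): State.INIT,  # no ACK: start over
}


def run(events, state=State.INIT):
    """Feed a sequence of events through the simplified client state machine."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


healthy = run(["send DISCOVER", "recv OFFER", "recv ACK"])
no_ack = run(["send DISCOVER", "recv OFFER", "timeout"])
print(healthy.value, no_ack.value)
```

The point of the sketch is that receiving an OFFER is not enough: without the final ACK the client never reaches BOUND and falls back to the start.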

What we see

Going back to the pcaps in my previous comment you can see the DHCP ACKs are sent by install1003, but none are received on cloudvirt1028. What is also clear is that client requests are effectively duplicated, as both CR routers receive the host's broadcasts, and both relay requests on towards install1003.

That is the case for both the DHCP DISCOVER and DHCP REQUEST packets cloudvirt1028 sends. What is different is that dhcpd (running on install1003), responds to every DHCP DISCOVER message it receives, but only responds to the first of every pair of DHCP REQUEST packets. i.e. when it gets a DHCP REQUEST packet it ignores the duplicate that arrives a moment later from the other CR. I have spent a little while looking at the ISC DHCPd docs to try and validate this is the normal behaviour, but I didn't see it mentioned. Either way it's clearly happening so I did not dwell on it.

The fact the DHCP OFFER packets do make it back to the host threw me off, hence my comment above saying the DHCP was getting back, when in fact the ACKs were not.
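
One way to see this pattern in the install server's syslog is to tally message types per relay address. The sketch below does that for the dhcpd lines pasted earlier in this task; `LINE_RE`, `tally_by_relay` and `SAMPLE` are hypothetical names, and the regular expression only assumes the "DHCPxxx ... via <relay>" layout visible in those lines.

```python
import re
from collections import defaultdict

# Matches dhcpd syslog lines of the form "... dhcpd[pid]: DHCPOFFER on ... via 10.64.20.3".
LINE_RE = re.compile(r"dhcpd\[\d+\]: (DHCP[A-Z]+) .*? via (\d+\.\d+\.\d+\.\d+)")


def tally_by_relay(lines):
    """Count each DHCP message type per relay (giaddr) seen in dhcpd log lines."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            message, relay = match.groups()
            counts[relay][message] += 1
    return {relay: dict(messages) for relay, messages in counts.items()}


SAMPLE = [
    "Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPDISCOVER from bc:97:e1:a7:3b:d8 via 10.64.20.3",
    "Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPOFFER on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.3",
    "Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPDISCOVER from bc:97:e1:a7:3b:d8 via 10.64.20.2",
    "Dec  2 01:55:44 install1003 dhcpd[19332]: DHCPOFFER on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.2",
    "Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPREQUEST for 10.64.20.52 (208.80.154.32) from bc:97:e1:a7:3b:d8 via 10.64.20.3",
    "Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPACK on 10.64.20.52 to bc:97:e1:a7:3b:d8 via 10.64.20.3",
    "Dec  2 01:55:48 install1003 dhcpd[19332]: DHCPREQUEST for 10.64.20.52 (208.80.154.32) from bc:97:e1:a7:3b:d8 via 10.64.20.2: unknown lease 10.64.20.52.",
]

for relay, messages in sorted(tally_by_relay(SAMPLE).items()):
    print(relay, messages)
```

On this sample both relays get an OFFER, but only 10.64.20.3 gets an ACK, so the client depends entirely on replies relayed through that one router.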

Why that breaks things

Cloudvirt1028 is directly connected to cloudsw1-d5-eqiad. Cloudsw1-d5-eqiad is connected to cr2-eqiad directly, so it forwards broadcasts from the host over that port to cr2-eqiad. There is also a path from that switch, via cloudsw1-c8-eqiad, to cr1-eqiad in the same vlan. So every broadcast is received by both CRs, and relayed to the install server. The longer path to get to CR1 means, however, that the request hits CR2 first every single time. As observed in previous comments here and here, we know that the DHCP response packets from install1003 are not making it back if they were relayed by cr2-eqiad.

The reasons for that are detailed in the Gerrit patch I've submitted to correct the problem. To cut a long story short, the return traffic is hitting cr1-eqiad first, and then getting sent out its directly connected link to cloud-hosts1-eqiad to get to cr2-eqiad. The packets thus hit a filter designed for traffic coming from cloud servers into the CRs, rather than for traffic coming from install1003, and get dropped. The patch adds a term to the filter to specifically allow these responses.
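
The combined effect can be captured in a toy model. This is only a sketch under the assumptions stated in this task: dhcpd answers every relayed DISCOVER but only the first copy of each REQUEST, and replies addressed to a "dropping" relay never reach the host. The function `simulate` and its parameters are hypothetical.

```python
def simulate(relay_order, drops_replies):
    """Model one DHCP exchange through duplicated relays.

    relay_order:   relays in the order their copies reach the server.
    drops_replies: relays whose return traffic is filtered on the way back.
    """
    # Every relayed DISCOVER gets an OFFER, so one working relay is enough.
    offers_delivered = [relay for relay in relay_order if relay not in drops_replies]
    # Only the first REQUEST copy is answered, so the ACK goes via that relay only.
    ack_relay = relay_order[0]
    return {
        "offer_received": bool(offers_delivered),
        "ack_received": ack_relay not in drops_replies,
    }


before_fix = simulate(["cr2-eqiad", "cr1-eqiad"], drops_replies={"cr2-eqiad"})
after_fix = simulate(["cr2-eqiad", "cr1-eqiad"], drops_replies=set())
print(before_fix)
print(after_fix)
```

Before the filter change the model gives an OFFER but no ACK, which matches the dhclient loop seen on the host; once replies via cr2-eqiad are allowed, both arrive.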

Test

Following a test implementation I can confirm that DHCP is now working for the host and it is entering PXE boot successfully :)

image0.jpeg (1×2 px, 784 KB)

cmooney@apt1001:~$ tail -10000 /var/log/nginx/access.log | grep 10.64.20.52
10.64.20.52 - - [07/Dec/2021:19:31:03 +0000] "GET /autoinstall/preseed.cfg HTTP/1.1" 200 18808 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:05 +0000] "GET /autoinstall/common.cfg HTTP/1.1" 200 2962 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:06 +0000] "GET /autoinstall/buster.cfg HTTP/1.1" 200 560 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:07 +0000] "GET /autoinstall/passwd.cfg HTTP/1.1" 200 443 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:07 +0000] "GET /autoinstall/override.cfg HTTP/1.1" 200 504 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:07 +0000] "GET /autoinstall/subnets/labs-hosts1-b-eqiad.cfg HTTP/1.1" 200 450 "-" "debian-installer"
10.64.20.52 - - [07/Dec/2021:19:31:09 +0000] "GET /autoinstall/scripts/early_command.sh HTTP/1.1" 200 1681 "-" "Wget/1.20.1 (linux-gnu)"
10.64.20.52 - - [07/Dec/2021:19:31:27 +0000] "GET /autoinstall/ssh/authorized_keys HTTP/1.1" 200 730 "-" "Wget/1.20.1 (linux-gnu)"

There is, of course, another issue with the reimage, in that it's not getting past the disk partitioning stage, but I will discuss that in a separate comment.

I am fine with wrangling with the disk partitioning pieces if you don't feel like it; IIRC the cloudvirts often prompt for a keypress at some point during install but otherwise succeed

@Andrew thanks yeah. I have the screen open here still and can do that if you wish:

partition_stuck.png (438×722 px, 56 KB)

I suspected the issue may be that the server wasn't matching any of the patterns in the below file, but I can see it seems to be accounted for:

https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/install_server/files/autoinstall/netboot.cfg#178

Given that's the case I'm sort of out of ideas. I'll follow up with you on IRC on whether to manually confirm.

I don't much care about having to click through the partman step but imaging still fails. Now it stalls on

'Attempt to run 'cookbooks.sre.hosts.reimage.ReimageRunner._populate_puppetdb.<locals>.poll_puppetdb' raised: Nagios_host resource with title cloudvirt1028 not found yet'

Change 744909 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] site.pp: fix a very important typo re: cloudvirt1028

https://gerrit.wikimedia.org/r/744909

Change 744909 merged by Andrew Bogott:

[operations/puppet@production] site.pp: fix a very important typo re: cloudvirt1028

https://gerrit.wikimedia.org/r/744909

Change 744854 merged by Cathal Mooney:

[operations/homer/public@master] Allow cloud-hosts1-eqiad DHCP responses to eqiad CRs

https://gerrit.wikimedia.org/r/744854

> I don't much care about having to click through the partman step but imaging still fails. Now it stalls on
>
> 'Attempt to run 'cookbooks.sre.hosts.reimage.ReimageRunner._populate_puppetdb.<locals>.poll_puppetdb' raised: Nagios_host resource with title cloudvirt1028 not found yet'

@Andrew The puppetization for the hosts should use the correct partman recipe, which should perform automatic partitioning without human interaction. If that's not the case it needs to be fixed. AFAICT the current setup in netboot.cfg for this host should match:

cloudvirt102[1-9]) echo partman/custom/labvirt-backy-ssd.cfg ;; \