@MoritzMuehlenhoff updated the netinst image for the latest buster point release, I've run puppet on install*, and I've run sudo -u mirror ftpsync on sodium, but I'm still getting "No kernel modules found".
Project | Branch | Lines +/- | Subject
---|---|---|---
operations/puppet | production | +19 -0 | aptrepo: populate /srv/tftpboot from volatile also on APT_repo servers
From d-i syslog:
May 11 09:11:07 anna[5770]: WARNING **: no packages matching running kernel 4.19.0-8-amd64 in archive
It looks like maybe initrd.gz got updated, but not the kernel?
root@puppetmaster1001:/var/lib/puppet/volatile/tftpboot/buster-installer/debian-installer/amd64# ls -l
total 119816
-rw-r--r-- 1 root root 1322936 May 4 19:14 bootnetx64.efi
drwxrwxr-x 2 root root 4096 May 4 19:14 boot-screens
drwxrwxr-x 3 root root 4096 May 4 19:14 grub
-rw-r--r-- 1 root root 1254768 May 4 19:14 grubx64.efi
-rw-rw-r-- 1 root root 114772537 May 11 07:17 initrd.gz
-rw-r--r-- 1 root root 5278960 May 4 19:14 linux
-rw-r--r-- 1 root root 42430 May 4 19:14 pxelinux.0
drwxrwxr-x 2 root root 4096 May 4 19:14 pxelinux.cfg
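The suspicion that initrd.gz and the kernel are out of step can be checked mechanically. The following is a minimal Python sketch, assuming that the kernel and initrd of one installer build are normally written within a short time of each other. The function names and the one-day threshold are illustrative assumptions, not part of any existing tooling.

```python
from pathlib import Path


def kernel_initrd_skew(netboot_dir, kernel="linux", initrd="initrd.gz"):
    """Return the absolute mtime difference (seconds) between kernel and initrd."""
    base = Path(netboot_dir)
    kernel_mtime = (base / kernel).stat().st_mtime
    initrd_mtime = (base / initrd).stat().st_mtime
    return abs(initrd_mtime - kernel_mtime)


def looks_mismatched(netboot_dir, max_skew_seconds=24 * 3600):
    """Flag a netboot directory whose kernel and initrd differ widely in age.

    Assumption: files from the same installer build land within a day of
    each other. A large skew is only a hint, not proof of a version mismatch.
    """
    return kernel_initrd_skew(netboot_dir) > max_skew_seconds
```

In the listing above, initrd.gz (May 11) is about a week newer than linux (May 4), so a check along these lines would flag the directory for a closer look.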
This seems to be caused by the separation of apt1001 and the new buster-based install servers: puppet updates /srv/tftpboot on install1003/2003, but the reimage by Kormat probably received a stale version of /srv/tftpboot from apt1001. Adding @Dzahn.
I can confirm that /srv/tftpboot on apt1001 is stale:
kormat@apt1001:/srv/tftpboot/buster-installer/debian-installer/amd64(0:0)$ ls -l
total 119796
-r--r--r-- 1 root root 1322936 Jun 27 2019 bootnetx64.efi
dr-xr-xr-x 2 root root 4096 Feb 10 07:30 boot-screens
dr-xr-xr-x 3 root root 4096 Nov 18 10:46 grub
-r--r--r-- 1 root root 1254768 Jul 3 2019 grubx64.efi
-r--r--r-- 1 root root 114759838 Feb 10 07:30 initrd.gz
-r--r--r-- 1 root root 5270768 Feb 10 07:30 linux
-r--r--r-- 1 root root 42430 Apr 9 2019 pxelinux.0
dr-xr-xr-x 2 root root 4096 Feb 5 2019 pxelinux.cfg
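Comparing the two listings by eye works here, but the same comparison can be sketched in Python. This is a rough illustration, assuming plain `ls -l` output with file names that contain no spaces; the helper names `parse_ls_sizes` and `stale_files` are hypothetical. The sample lines are copied from the puppetmaster1001 and apt1001 listings in this task.

```python
def parse_ls_sizes(listing):
    """Map file name -> size in bytes for regular files in `ls -l` output."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # Regular files start with '-'; skip 'total', directories, blank lines.
        if len(fields) < 9 or not fields[0].startswith("-"):
            continue
        sizes[fields[8]] = int(fields[4])
    return sizes


def stale_files(reference_listing, candidate_listing):
    """Names whose size in the candidate differs from (or is missing vs.) the reference."""
    ref = parse_ls_sizes(reference_listing)
    cand = parse_ls_sizes(candidate_listing)
    return sorted(name for name in ref if cand.get(name) != ref[name])


PUPPETMASTER_LISTING = """\
-rw-r--r-- 1 root root 1322936 May 4 19:14 bootnetx64.efi
-rw-r--r-- 1 root root 1254768 May 4 19:14 grubx64.efi
-rw-rw-r-- 1 root root 114772537 May 11 07:17 initrd.gz
-rw-r--r-- 1 root root 5278960 May 4 19:14 linux
-rw-r--r-- 1 root root 42430 May 4 19:14 pxelinux.0
"""

APT1001_LISTING = """\
-r--r--r-- 1 root root 1322936 Jun 27 2019 bootnetx64.efi
-r--r--r-- 1 root root 1254768 Jul 3 2019 grubx64.efi
-r--r--r-- 1 root root 114759838 Feb 10 07:30 initrd.gz
-r--r--r-- 1 root root 5270768 Feb 10 07:30 linux
-r--r--r-- 1 root root 42430 Apr 9 2019 pxelinux.0
"""

print(stale_files(PUPPETMASTER_LISTING, APT1001_LISTING))
# prints ['initrd.gz', 'linux']
```

Equal sizes do not prove that two files are identical, so a checksum comparison would be the stronger test; size is simply what the listings make available.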
To unbreak current Buster installs it should be sufficient to replace /srv/tftpboot/buster-installer on apt1001.wikimedia.org with a version from install1003 or install2003.
To fix this for good we can either:
- have /srv/tftpboot on apt1001 be populated from the volatile directory
- introduce installserver aliases similar to the webproxy CNAME (it seems unclean to reuse this) and change d-i to fetch the install image from there
Change 595507 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] aptrepo: populate /srv/tftpboot from volatile also on APT_repo servers
Change 595507 merged by Dzahn:
[operations/puppet@production] aptrepo: populate /srv/tftpboot from volatile also on APT_repo servers
I did the first option with the puppet patch above, and /srv/tftpboot on apt1001 is now populated from volatile.
I tested it on backup1002 and this worked well. This can be closed, but I wonder if we should have a working group on improving the install and deb service. When it was split we discussed that the split was well intended, but it had some surprising consequences, and I think @Dzahn was surprised by this double dependency on these files. Maybe we can come up with a better split strategy?
I think ultimately we should serve the tftpboot environment from the install servers, especially once we add install* servers to the Ganeti clusters in the edge PoPs. But that needs some changes to the install roles, so that they also have a web server etc.
Closing per: T252382#6124358
If we want further discussion of a long-term solution, we can always create a new task.
Thanks!
Change 595892 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] introduce sectools1001.eqiad.wmnet
Thanks!
- but I wonder if we should have a working group
I don't think that's needed. The discussion on the strategy was/is T242602 which can still be used.
And what Moritz said above ("serve the tftpboot environment from the install servers, especially once we add install* servers to the Ganeti clusters in the edge PoPs") is already agreed on and will be implemented.