unattended-upgrades is trying to update initramfs-tools, but there's a race in the package that sometimes causes dpkg to hang.
This has broken dpkg, apt, and puppet in many places.
I've seen some hosts stuck in unattended-upgrade because of NFS. At some point dpkg calls sync to flush filesystems and everything stalls. The only option I've found is to hard reboot them.
Ran cumin 'P{F:lsbdistcodename = jessie}' 'ps auxwf | grep -v grep | grep dpkg' on deployment-cumin. No dpkg processes are stuck running on any of deployment-prep's 34 jessie instances.
tools-worker-1018:~$ sudo dpkg --configure -a
Setting up initramfs-tools (0.120+deb8u3) ...
update-initramfs: deferring update (trigger activated)
Processing triggers for initramfs-tools (0.120+deb8u3) ...
update-initramfs: Generating /boot/initrd.img-4.9.0-0.bpo.6-amd64
It's sure taking its time on the actual generating there.
After a hard reboot, I was able to get it to run puppet, but I was surprised at how many files it thought needed changes (for the most part, the changes don't touch actual content). P8212
The DNS server is an actual change.
Sometime when it's not the weekend let's audit all instances for stuck dpkg processes. This might be happening all over the place.
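For that audit, something like the following might help separate a dpkg run that is actually wedged from one that just started. It is only a rough sketch: the function name, the field layout, and the one-hour threshold are my own assumptions, not anything we have deployed. It expects headerless ps output with four columns (pid, elapsed seconds, state, full command line).

```python
from typing import List, NamedTuple


class StuckProcess(NamedTuple):
    pid: int
    elapsed_seconds: int
    state: str
    command: str


def find_stuck_dpkg(ps_output: str, min_age_seconds: int = 3600) -> List[StuckProcess]:
    """Return dpkg-related processes that have been running a long time.

    Assumes each line holds: pid, elapsed seconds, state, command line,
    separated by whitespace, with no header line. The 3600-second default
    is an arbitrary illustrative threshold.
    """
    stuck = []
    for line in ps_output.splitlines():
        # Split into at most 4 fields so the command keeps its spaces.
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # blank or malformed line
        pid, elapsed, state, command = fields
        if not (pid.isdigit() and elapsed.isdigit()):
            continue  # not a data row
        # Plain substring match: this also catches maintainer scripts
        # under /var/lib/dpkg/info, and would catch "grep dpkg" too.
        if "dpkg" not in command:
            continue
        if int(elapsed) >= min_age_seconds:
            stuck.append(StuckProcess(int(pid), int(elapsed), state, command))
    return stuck
```

If the ps on the instances supports the etimes column (I have not checked jessie's version), the input could come from `ps -e -o pid= -o etimes= -o stat= -o args=` run through cumin. A state starting with D (uninterruptible sleep) would be a hint that the process is blocked on I/O, which is what the NFS sync stalls look like.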
Mentioned in SAL (#wikimedia-cloud) [2019-03-17T17:46:12Z] <bstorm_> depooling tools-worker-1009 and tools-worker-1012 for T218514
Mentioned in SAL (#wikimedia-cloud) [2019-03-17T17:48:10Z] <bstorm_> T218514 rebooting tools-worker-1009 and 1012
This seems to be fine now. I double-checked the state of apt and dpkg, and although a few things are still stuck from race conditions, there's nothing widespread or serious going on.