Today I ran into an issue where a host I was reimaging with wmf-auto-reimage-host failed to reboot into PXE. The cause wasn't immediately obvious to me, because the user-facing output showed only this:
13:24:07 | ms-be2057.codfw.wmnet | Still waiting for reboot after 20.0 minutes
13:29:10 | ms-be2057.codfw.wmnet | Still waiting for reboot after 25.0 minutes
13:34:12 | ms-be2057.codfw.wmnet | Still waiting for reboot after 30.0 minutes
13:39:18 | ms-be2057.codfw.wmnet | Still waiting for reboot after 35.0 minutes
Cumin's log file, on the other hand, showed that cat /proc/uptime had failed:
PASS |          |   0% (0/1) [00:00<?, ?hosts/s]
FAIL |██████████| 100% (1/1) [00:00<00:00, 17.81hosts/s]
100.0% (1/1) of nodes failed to execute command 'cat /proc/uptime': ms-be2057.codfw.wmnet
100.0% (1/1) of nodes failed to execute command 'cat /proc/uptime': ms-be2057.codfw.wmnet
0.0% (0/1) success ratio (< 100.0% threshold) of nodes successfully executed all commands. Aborting.
0.0% (0/1) success ratio (< 100.0% threshold) of nodes successfully executed all commands. Aborting.
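Running the same command through cumin interactively worked, however. The difference is that the reimage script executes cumin with a debian-installer-specific key, which presumably is not accepted by a host that never rebooted into the installer.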
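
To illustrate why a failing cat /proc/uptime leaves the script waiting, here is a minimal sketch of an uptime-based reboot check. It is not the reimage script's actual code: the function names and the comparison logic are assumptions made for illustration only.

```python
def parse_uptime(text: str) -> float:
    """Return the uptime in seconds from the content of /proc/uptime."""
    # /proc/uptime holds two numbers: uptime and aggregate idle time, in seconds.
    return float(text.split()[0])


def has_rebooted(uptime_text: str, seconds_since_reboot_request: float) -> bool:
    """Hypothetical check: the host rebooted if it has been up for less
    time than has passed since the reboot was requested."""
    return parse_uptime(uptime_text) < seconds_since_reboot_request
```

A check like this can only answer when the command succeeds. If the login itself fails, as in the log above, there is no uptime to parse, so "the host has not rebooted" and "the host cannot be reached with this key" look the same to the caller.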