Today PXE boot was broken: the system would hang with a blank screen after pressing F12. The cause was a crashed tftpd on install1002. Creating this task to implement some basic monitoring of the TFTP service to avoid similar surprises in the future.
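For illustration only, a liveness probe for a TFTP daemon could look roughly like the sketch below. This is not the check added by the patches listed here. It assumes that any well-formed reply to a read request (a DATA packet, or an ERROR packet such as "file not found") means the daemon is alive, and that silence means it is down. The function names, the probe filename `probe`, and the 3 second timeout are all hypothetical. The return codes follow the usual Icinga/Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import socket
import struct

# TFTP opcodes from RFC 1350
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5

# Icinga/Nagios plugin exit codes
OK, CRITICAL = 0, 2


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def check_tftp(host, port=69, filename="probe", timeout=3.0):
    """Send one read request and report whether the daemon answered.

    Returns a (status_code, message) tuple. The probe file does not
    need to exist: an ERROR reply still proves the daemon is running.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_rrq(filename), (host, port))
            # The server answers from a new ephemeral port, so use
            # recvfrom on an unconnected socket rather than connect().
            reply, _ = sock.recvfrom(1024)
        except OSError as exc:
            # Covers timeouts as well as resolution/network errors.
            return CRITICAL, f"TFTP CRITICAL - no reply from {host}:{port} ({exc})"

    if len(reply) < 2:
        return CRITICAL, f"TFTP CRITICAL - short reply from {host}:{port}"

    opcode = struct.unpack("!H", reply[:2])[0]
    if opcode in (OP_DATA, OP_ERROR):
        # We never ACK a DATA packet; the server simply gives up on
        # the transfer after its own retransmission timeout.
        return OK, f"TFTP OK - {host}:{port} answered with opcode {opcode}"
    return CRITICAL, f"TFTP CRITICAL - unexpected opcode {opcode} from {host}:{port}"
```

A wrapper script would call something like `code, message = check_tftp("install1002")` (substituting the host's real address), print `message`, and exit with `code` so that Icinga can pick up the state.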
Related Gerrit Patches:
|operations/puppet : production||installserver: add monitoring for TFTP|
|operations/puppet : production||install_server: add Icinga monitoring for TFTP service|
- converted puppet role to profile
- re-added monitoring section to profile (now the style check is happy about that)
- the check now appears in Icinga again: https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=tftp