Looks like es1019 has an issue with the IPMI.
Per icinga:
ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-es1019.localhost: internal IPMI error
I wanted to soft-restart the IPMI, but I cannot reach the management interface; it appears to be down.
ssh es1019.mgmt.eqiad.wmnet -lroot
channel 0: open failed: connect failed: Connection timed out
root@cumin1001:~# ping es1019.mgmt.eqiad.wmnet
PING es1019.mgmt.eqiad.wmnet (10.65.4.44) 56(84) bytes of data.
^C
--- es1019.mgmt.eqiad.wmnet ping statistics ---
128 packets transmitted, 0 received, 100% packet loss, time 130024ms
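For anyone who wants to re-check the management interface later without waiting on ping, here is a minimal sketch. It assumes the mgmt interface normally accepts SSH on TCP port 22; the function name and the timeout value are illustrative and not part of any existing tooling.

```python
import socket


def mgmt_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


# Example call (hostname taken from this task):
# mgmt_reachable("es1019.mgmt.eqiad.wmnet")
```

A False result only says that the TCP port did not answer in time; it does not distinguish a dead BMC from a network or firewall problem.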
I am not sure whether an on-site IPMI reseat requires the host to go fully down. If so, please coordinate with me, as we would need to depool it first.
This is not the first occurrence on this host:
T201132
T187530
T213422
T155691