Task to collect information on Kubernetes worker nodes that hang. For each affected node, record:
- Time when the hang started
- Symptoms visible in Graphite (high iowait? high CPU usage?)
- labvirt host it was running on
- Whether SSH was still accessible
- Snippets from the console log
- How it was 'fixed'
Table:
| Date | Node | labvirt | High iowait | SSH accessible | Fix type |
|---|---|---|---|---|---|
| 7/21 | tools-worker-1018 | labvirt1010 | yes | no | reboot |
| 7/22 | tools-worker-1015 | labvirt1010 | yes | no | reboot |
| 7/25 | tools-worker-1005 | labvirt1004 | yes | yes | destroyed with fire |
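
As the table grows, it may help to tally hangs per labvirt host to see whether they cluster on particular hardware. The following is only a rough sketch: the rows are copied by hand from the table above, and the `INCIDENTS` list and `hangs_per_labvirt` helper are hypothetical names made up for illustration, not part of any existing tooling.

```python
from collections import Counter

# Rows copied by hand from the table above:
# (date, node, labvirt host, high iowait?, ssh accessible?, fix type)
INCIDENTS = [
    ("7/21", "tools-worker-1018", "labvirt1010", True, False, "reboot"),
    ("7/22", "tools-worker-1015", "labvirt1010", True, False, "reboot"),
    ("7/25", "tools-worker-1005", "labvirt1004", True, True, "destroyed with fire"),
]


def hangs_per_labvirt(incidents):
    """Count recorded hangs per labvirt host (third field of each row)."""
    return Counter(row[2] for row in incidents)


if __name__ == "__main__":
    # Print hosts ordered from most to fewest recorded hangs.
    for host, count in hangs_per_labvirt(INCIDENTS).most_common():
        print(f"{host}: {count}")
```

With only three rows this proves nothing on its own, but the same tally can be rerun whenever new rows are added.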






