This task exists purely to track labs instances that hang with messages about hung tasks in the kernel / console log. There are subtasks: T140256 for tools k8s and T141017 for k8s worker nodes. T124133 also has more information.
| Instance | Image | virt host | time | ssh | iowait | io spike | graphite data | nfs | recovery
| deployment-logstash2 | Jessie | labvirt1010 | 27 jul 09:00Z | no | no | no | no | no | ?
| tools-merlbot-proxy | Jessie | labvirt1010 | 31 jul 01:00Z | no | yes | regular spikes to 300 (typically 50) | no | yes? | rebooted
| novaproxy-01 | debian-8.1-jessie (deprecated 2016-01-12) | labvirt1001 | 2 Aug ~21:15Z | no | no | no | no | yes | reboot
| librarybase-reston-01 | Jessie | labvirt1006 | 07 Jul ~0200Z | no | no | no | no | no | reboot
#### Legend ####
Instance: Name of instance
virt host: The virt host the instance is hosted on (you can find this info on wikitech.wikimedia.org/wiki/Nova_Resource:$fqdn)
time: Time at which the host hung
ssh: If you could still ssh into it
iowait: If the host had high iowait (you can find this from https://graphite-labs.wikimedia.org/), e.g. tools.tools-merlbot-proxy.cpu.total.iowait
--> https://graphite-labs.wikimedia.org/render/?width=586&height=308&_salt=1469982906.471&target=tools.tools-merlbot-proxy.cpu.total.iowait
io spike: If the host had io spikes (you can find this from https://graphite-labs.wikimedia.org/), e.g. tools.tools-merlbot-proxy.iostat.vda.io. Other devices (vdb, ...) might also exist.
--> https://graphite-labs.wikimedia.org/render/?width=586&height=308&_salt=1469983050.092&target=tools.tools-merlbot-proxy.iostat.vda.io
graphite data: If the host is sending graphite data even after it hung
nfs: If the host had NFS mounted
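
To fill in the iowait and io spike columns for a new instance, the graphite render URLs shown in the legend can be generated rather than edited by hand. The following is only a sketch: it assumes metric names follow the `<project>.<instance>.<metric>` pattern seen in the tools-merlbot-proxy examples above, and the function names (`metric_target`, `render_url`, `hang_check_urls`) are made up for illustration. It omits the `_salt` parameter from the example URLs and does not fetch anything, it only builds the links.

```python
from urllib.parse import urlencode

# Base render endpoint, taken from the example URLs in the legend.
GRAPHITE_RENDER = "https://graphite-labs.wikimedia.org/render/"


def metric_target(project, instance, suffix):
    """Build a graphite target such as tools.tools-merlbot-proxy.cpu.total.iowait.

    Assumption: metrics are named <project>.<instance>.<suffix>, as in the
    examples in the legend. Other projects may differ.
    """
    return f"{project}.{instance}.{suffix}"


def render_url(project, instance, suffix, width=586, height=308):
    """Return a graphite render URL for one metric of one instance."""
    query = urlencode({
        "width": width,
        "height": height,
        "target": metric_target(project, instance, suffix),
    })
    return f"{GRAPHITE_RENDER}?{query}"


def hang_check_urls(project, instance, disk="vda"):
    """Return the two graphs used for the iowait and io spike columns.

    `disk` defaults to vda; other devices (vdb, ...) might also exist.
    """
    return {
        "iowait": render_url(project, instance, "cpu.total.iowait"),
        "io spike": render_url(project, instance, f"iostat.{disk}.io"),
    }


if __name__ == "__main__":
    for column, url in hang_check_urls("tools", "tools-merlbot-proxy").items():
        print(f"{column}: {url}")
```

For example, `hang_check_urls("tools", "tools-merlbot-proxy")` yields the same two targets as the example links in the legend, minus the `_salt` parameter.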