When the tools NFS filesystem filled up and needed restarting (see T217068 for the timing), the old Ubuntu Trusty grid running Sun Grid Engine 6 had few errors. The newer Debian Stretch grid running Son of Grid Engine lost at least 19 nodes, all of which hung and had to be rebooted from within OpenStack to recover (a soft reboot was generally sufficient).
This task tracks the investigation into why the new grid is more fragile in the face of NFS errors than the old one.