Currently,
- tools-exec-1401 and
- tools-exec-catscan
are effectively offline due to DNS issues. Their queues
valhallasw@tools-bastion-01:~$ qstat -f | grep 'exec-1401\|catscan'
mailq@tools-exec-1401.tools.eq BP    0/0/5      -NA-  -NA-  au
task@tools-exec-1401.tools.eqi BIP   0/0/50     -NA-  -NA-  au
continuous@tools-exec-1401.too BC    0/0/50     -NA-  -NA-  au
catscan@tools-exec-catscan.too BIC   0/0/1000   -NA-  -NA-  au
are all in au (alarm, unknown) state. The cause of this is that SGE actually knows the hosts as
valhallasw@tools-bastion-01:/data/project/admin/public_html$ qhost -h tools-exec-1401 tools-exec-catscan
HOSTNAME                         ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                           -           -     -     -       -       -       -
tools-exec-1401.eqiad.wmflabs    lx26-amd64  4     0.01  7.8G    234.0M  23.9G   0.0
tools-exec-catscan.eqiad.wmflabs lx26-amd64  4     0.01  7.8G    233.7M  1.9G    0.0
i.e. without .tools. in the hostname, while the queues do have .tools. in the hostname.
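The mismatch can be checked mechanically. The following is a minimal sketch, not part of any existing tooling: it hard-codes the names from the qstat and qhost output above, and the helper `find_mismatches` is a hypothetical name introduced for illustration. Because qstat -f truncates long queue instance names, the host part is compared as a prefix against the hostnames SGE reports.

```python
# Hostnames as SGE reports them via qhost (copied from the output above).
QHOST_NAMES = [
    "tools-exec-1401.eqiad.wmflabs",
    "tools-exec-catscan.eqiad.wmflabs",
]

# Queue instances as printed by qstat -f. qstat truncates long names,
# so the host part may be only a prefix of the full hostname.
QUEUE_INSTANCES = [
    "mailq@tools-exec-1401.tools.eq",
    "task@tools-exec-1401.tools.eqi",
    "continuous@tools-exec-1401.too",
    "catscan@tools-exec-catscan.too",
]


def find_mismatches(queue_instances, qhost_names):
    """Return queue instances whose host part matches no qhost hostname.

    Hypothetical helper: a host part matches when it is a prefix of at
    least one hostname known to qhost.
    """
    mismatched = []
    for instance in queue_instances:
        _queue, _, host_prefix = instance.partition("@")
        if not any(name.startswith(host_prefix) for name in qhost_names):
            mismatched.append(instance)
    return mismatched


if __name__ == "__main__":
    for instance in find_mismatches(QUEUE_INSTANCES, QHOST_NAMES):
        print(instance)
```

With the data above, every queue instance is reported, since none of the truncated host parts is a prefix of a hostname that qhost lists.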
See also T109485: Remove modules/toollabs/files/host_aliases