Some instances are suffering from CPU lockdown / slowness that make them rather unresponsive.
An example is [[ https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000006d8.eqiad.wmflabs | deployment-parsoid05 ]] which runs on virt1007 [[ http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cpu_report&c=Virtualization+cluster+eqiad&h=virt1007.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=ALLGROUPS | Ganglia view ]]. The server has spikes of WIO but is otherwise low on CPU usage.
A dmesg T97421#1244842 shows a `BUG: soft lockup - CPU#1 stuck for 23s! [nodejs:27358]` error.
That is tracked on {T97421}
@petrb reported a similar issue on [[ https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000bc2.eqiad.wmflabs | huggle-d2 ]] which runs on labvirt1005 ([[ http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cpu_report&c=Virtualization+cluster+eqiad&h=labvirt1005.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=ALLGROUPS | ganglia view ]]).