On cache_text nodes, there is one ET_NET ats-be thread doing much more work than all others:
```
root@cp3052:~# top -b -H -n1 -p `ps xa | awk '/traffic_ser[v]er -M --httpport 3128/ {print $1}'` | head -n 20
top - 12:43:09 up 37 days, 19:53,  1 user,  load average: 12.62, 12.42, 11.68
Threads:  89 total,   1 running,  88 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.6 us,  5.5 sy,  0.0 ni, 77.8 id,  0.1 wa,  0.0 hi,  2.0 si,  0.0 st
KiB Mem : 39465862+total, 84766352 free, 30453852+used,  5353772 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 82635152 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
250345 traffic+  20   0 16.812g 0.013t  11492 R 86.7  3.6  52:31.47 [ET_NET 58]
250288 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  34:43.15 [ET_NET 1]
250289 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  36:20.96 [ET_NET 2]
250291 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  51:11.93 [ET_NET 4]
250296 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  34:20.67 [ET_NET 9]
250297 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  35:57.68 [ET_NET 10]
250304 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  74:05.74 [ET_NET 17]
250306 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  34:29.38 [ET_NET 19]
250309 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  47:52.87 [ET_NET 22]
250319 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  35:55.25 [ET_NET 32]
250322 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  61:50.29 [ET_NET 35]
250323 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  79:00.64 [ET_NET 36]
250324 traffic+  20   0 16.812g 0.013t  11492 S  6.7  3.6  34:17.38 [ET_NET 37]
```
Observing the behavior live with `top -H -p $(ps xa | awk '/traffic_ser[v]er -M --httpport 3128/ {print $1}')`, it becomes clear that ET_NET 58 in the example above consistently uses a disproportionate amount of CPU compared to all other threads, at times approaching 90%.
The cause of this is purge traffic sent by vhtcpd: restarting vhtcpd makes a different ET_NET thread become the "busy" one.
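To quantify this over time (for example, before and after a vhtcpd restart) without eyeballing top, per-thread CPU counters from /proc can be diffed. The following is a minimal, hypothetical helper sketch, not an existing tool; the pgrep pattern is taken from the command above, and the 5-second sampling interval is an arbitrary choice:

```
#!/usr/bin/env python3
# Sketch: find the busiest threads of the ats-be process by sampling
# utime+stime from /proc/<pid>/task/<tid>/stat twice and diffing.
import os, subprocess, time

# Assumed to match the ats-be command line shown in the top output above.
PATTERN = "traffic_server -M --httpport 3128"
pid = subprocess.check_output(["pgrep", "-f", PATTERN]).split()[0].decode()

def cpu_ticks():
    ticks = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                # Split after the ")" closing the comm field; the comm
                # itself may contain spaces (e.g. "[ET_NET 58]").
                fields = f.read().rsplit(")", 1)[1].split()
            name = open(f"/proc/{pid}/task/{tid}/comm").read().strip()
            # fields[11] and fields[12] are utime and stime (clock ticks).
            ticks[(tid, name)] = int(fields[11]) + int(fields[12])
        except OSError:
            pass  # thread exited between listdir() and open()
    return ticks

before = cpu_ticks()
time.sleep(5)
after = cpu_ticks()
busiest = sorted(after.items(), key=lambda kv: kv[1] - before.get(kv[0], 0),
                 reverse=True)
for (tid, name), t in busiest[:5]:
    print(f"{name:12s} tid={tid} +{t - before.get((tid, name), 0)} ticks")
```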
Given the current architecture of vhtcpd, we send a high volume of requests (~5k per second at the time of this writing) over a single TCP connection. Since ATS serves each TCP connection from a single ET_NET thread for the connection's lifetime, all of that purge traffic lands on one thread. It might be worth exploring alternative approaches, such as having vhtcpd spread its HTTP requests over a pool of TCP connections to distribute the load more evenly.
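A minimal sketch of the pooling idea follows; this is illustrative Python, not vhtcpd code, and the host, port, pool size, and example URL are assumptions:

```
#!/usr/bin/env python3
# Sketch: round-robin PURGE requests over a pool of persistent TCP
# connections so that ATS can spread them across several ET_NET threads
# (each connection is bound to one net thread for its lifetime).
import itertools, socket

ATS_HOST, ATS_PORT = "127.0.0.1", 3128  # assumed local ats-be instance
POOL_SIZE = 8                           # assumed; tune to ET_NET thread count

pool = [socket.create_connection((ATS_HOST, ATS_PORT))
        for _ in range(POOL_SIZE)]
next_conn = itertools.cycle(pool)

def purge(url_path, host):
    """Send one PURGE request on the next connection in the pool."""
    s = next(next_conn)
    req = (f"PURGE {url_path} HTTP/1.1\r\n"
           f"Host: {host}\r\n"
           f"\r\n").encode()
    s.sendall(req)
    # Simplification: a real client would parse the status line, headers,
    # and Content-Length instead of a single fixed-size read.
    s.recv(4096)

purge("/wiki/Main_Page", "en.wikipedia.org")  # hypothetical example purge
```

With requests spread across POOL_SIZE connections, each connection can land on a different ET_NET thread, so the purge load should be divided roughly evenly instead of saturating a single thread.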