
tools-webgrid-lighttpd have ~ 90 procs stuck at 100% CPU time (mostly tools.jembot)
Closed, Duplicate | Public

Description

TLDR workaround

Here are the clush commands @bd808 has been using to first check for and then kill processes that have leaked out of grid engine due to some cleanup failure:

find-orphans
$ clush -w @exec -w @webgrid -b 'ps axwo user:20,ppid,pid,cmd | grep -Ev "^($USER|root|daemon|diamond|_lldpd|messagebus|nagios|nslcd|ntp|prometheus|statd|syslog|Debian-exim|www-data|sgeadmin)"|grep -v perl|grep -E "     1 "'
kill-orphans
$ clush -w @exec -w @webgrid -b 'ps axwo user:20,ppid,pid,cmd | grep -Ev "^($USER|root|daemon|diamond|_lldpd|messagebus|nagios|nslcd|ntp|prometheus|statd|syslog|Debian-exim|www-data|sgeadmin)"|grep -v perl|grep -E "     1 "|awk "{print \$3}"|xargs sudo kill -9'

While looking at the CPU usage of instances on labs, I found that the tools-webgrid-lighttpd-* instances report high user CPU usage. It seems to leak over time, as shown over five months (CPU usage is expressed as the number of CPUs kept busy):

lighttpd_cpu_usage.png (504×711 px, 89 KB)

Over a week:

lighttpd_cpu_usage_one_week.png (504×711 px, 108 KB)

It seems we have the equivalent of 90 CPUs stuck at 100%.

View of CPU usage for tools-webgrid-lighttpd-* instances.

Based on:

# List processes with their cputime, percent CPU, command, and username
# Keep only processes with a cputime of 1 day or more and a current CPU usage of at least 10%
cmd="ps -e -o cputime,pcpu,cmd,user|grep -P '^\d+-.* \d\d\.'|sort -n|sed -e 's%\(^\|\s\+\)% | %g'"
for i in $(seq 1401 1428); do
    ssh "tools-webgrid-lighttpd-$i.tools.eqiad.wmflabs" "$cmd" | sed -e "s%^%| $i %"
done

The top offender seems to be tools.jembot. As of January 4th, 2018:

| Instance | Cumulative CPU time (days-hh:mm:ss) | Utilization (%) | Command | User
| 1401 | 20-14:46:57 | 99.2 | /usr/bin/php-cgi | tools.jembot
| 1401 | 178-13:37:34 | 95.5 | /usr/bin/php-cgi | tools.blockcalc
| 1404 | 20-12:40:45 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1404 | 20-18:45:12 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1404 | 170-12:48:17 | 91.2 | /usr/bin/php-cgi | tools.wam
| 1405 | 20-22:20:24 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1405 | 174-15:13:09 | 93.4 | /usr/bin/php-cgi | tools.dupdet
| 1407 | 20-11:46:07 | 99.7 | /usr/bin/php-cgi | tools.jembot
| 1407 | 21-02:11:31 | 99.6 | /usr/bin/php-cgi | tools.jembot
| 1407 | 157-10:17:17 | 99.4 | /usr/bin/php-cgi | tools.wsexport
| 1408 | 65-23:01:42 | 90.2 | /usr/bin/php-cgi | tools.spellcheck
| 1408 | 69-02:01:07 | 90.5 | /usr/bin/php-cgi | tools.spellcheck
| 1408 | 77-08:11:17 | 98.3 | /usr/bin/php-cgi | tools.pinyin-wiki
| 1408 | 79-03:28:27 | 96.5 | /usr/bin/php-cgi | tools.spellcheck
| 1409 | 43-06:46:25 | 72.3 | /usr/bin/php-cgi | tools.jembot
| 1409 | 43-06:47:33 | 72.3 | /usr/bin/php-cgi | tools.jembot
| 1409 | 43-06:49:12 | 72.3 | /usr/bin/php-cgi | tools.jembot
| 1409 | 43-06:50:14 | 72.3 | /usr/bin/php-cgi | tools.jembot
| 1410 | 7-05:27:30 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1410 | 7-05:27:32 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1410 | 7-05:27:41 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1410 | 7-05:27:59 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1410 | 20-13:12:38 | 98.3 | /usr/bin/php-cgi | tools.jembot
| 1410 | 185-05:23:07 | 99.1 | /usr/bin/php-cgi | tools.supercount
| 1411 | 7-12:04:11 | 98.9 | /usr/bin/php-cgi | tools.dupdet
| 1411 | 21-00:40:14 | 99.9 | /usr/bin/php-cgi | tools.jembot
| 1411 | 39-01:19:03 | 98.1 | /usr/bin/php-cgi | tools.dupdet
| 1412 | 117-15:40:47 | 79.5 | /usr/bin/php-cgi | tools.sowhy
| 1412 | 117-16:36:43 | 79.5 | /usr/bin/php-cgi | tools.sowhy
| 1413 | 20-11:19:09 | 99.4 | /usr/bin/php-cgi | tools.jembot
| 1413 | 99-16:41:22 | 98.0 | /usr/bin/php-cgi | tools.croptool
| 1413 | 101-22:23:08 | 98.6 | /usr/bin/php-cgi | tools.wsexport
| 1414 | 6-01:45:25 | 49.1 | /usr/bin/php-cgi | tools.jembot
| 1414 | 6-01:46:21 | 49.1 | /usr/bin/php-cgi | tools.jembot
| 1414 | 6-01:47:22 | 49.1 | /usr/bin/php-cgi | tools.jembot
| 1414 | 6-01:47:32 | 49.1 | /usr/bin/php-cgi | tools.jembot
| 1414 | 20-14:59:49 | 98.9 | /usr/bin/php-cgi | tools.jembot
| 1414 | 20-23:36:58 | 98.9 | /usr/bin/php-cgi | tools.jembot
| 1415 | 12-06:45:37 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1415 | 110-05:29:54 | 95.8 | /usr/bin/php-cgi | tools.dupdet
| 1415 | 137-08:25:28 | 99.2 | /usr/bin/php-cgi | tools.wsexport
| 1416 | 20-09:10:48 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1416 | 20-09:12:58 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1416 | 41-10:19:46 | 90.5 | /usr/bin/php-cgi | tools.wsexport
| 1416 | 111-15:27:57 | 96.1 | /usr/bin/php-cgi | tools.jembot
| 1417 | 10-14:06:20 | 71.9 | /usr/bin/php-cgi | tools.jembot
| 1417 | 10-14:11:09 | 71.9 | /usr/bin/php-cgi | tools.jembot
| 1417 | 10-14:25:11 | 72.0 | /usr/bin/php-cgi | tools.jembot
| 1417 | 10-14:38:52 | 72.1 | /usr/bin/php-cgi | tools.jembot
| 1417 | 104-16:14:40 | 96.0 | /usr/bin/php-cgi | tools.wsexport
| 1418 | 26-22:28:14 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1418 | 26-22:29:15 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1418 | 26-22:30:01 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1418 | 26-22:30:06 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1419 | 20-14:39:46 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1419 | 21-03:42:07 | 99.7 | /usr/bin/php-cgi | tools.jembot
| 1420 | 20-14:47:52 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1420 | 20-14:54:44 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1420 | 21-03:49:59 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1421 | 22-09:58:37 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1421 | 22-10:00:35 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1421 | 22-10:03:00 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1421 | 22-10:03:32 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1422 | 25-05:58:24 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1422 | 25-06:00:02 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1422 | 25-06:00:47 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1422 | 25-06:02:09 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1425 | 23-21:45:04 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1425 | 23-21:46:35 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1425 | 23-21:46:58 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1425 | 23-21:48:50 | 98.4 | /usr/bin/php-cgi | tools.jembot
| 1426 | 24-18:01:55 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1426 | 24-18:02:38 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1426 | 24-18:03:03 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1426 | 24-18:03:22 | 98.8 | /usr/bin/php-cgi | tools.jembot
| 1427 | 24-18:30:27 | 99.6 | /usr/bin/php-cgi | tools.jembot
| 1427 | 24-18:31:11 | 99.6 | /usr/bin/php-cgi | tools.jembot
| 1427 | 24-19:33:26 | 99.8 | /usr/bin/php-cgi | tools.jembot
| 1428 | 35-13:33:16 | 93.3 | /usr/bin/php-cgi | tools.iabot
| 1428 | 134-02:15:20 | 97.8 | /usr/bin/php-cgi | tools.wsexport

Event Timeline

@-jem- if you get some time available, can you check tools.jembot on Toolforge? It seems to have a bunch of processes stuck at 100% CPU usage for quite a long time :]

I stopped the webservice running under jembot for now, as I'm unsure whether this is an issue with the tool itself, but it had indeed leaked processes all over the webgrid nodes. I then purged the processes running as the tools.jembot user.

Chase killed off all of jem's processes, and CPU usage is plummeting.

jembotdeath.png (764×2 px, 173 KB)

Awesome. Running the script I pasted earlier, there are far fewer rogue processes left:

| Instance | Cumulative CPU time (days-hh:mm:ss) | Utilization (%) | Command | User
| 1401 | 150-17:59:56 | 94.9 | /usr/bin/php-cgi | tools.blockcalc
| 1404 | 142-11:53:44 | 89.7 | /usr/bin/php-cgi | tools.wam
| 1405 | 146-13:55:05 | 92.3 | /usr/bin/php-cgi | tools.dupdet
| 1407 | 129-10:19:01 | 99.4 | /usr/bin/php-cgi | tools.wsexport
| 1408 | 38-14:04:16 | 85.7 | /usr/bin/php-cgi | tools.spellcheck
| 1408 | 41-17:03:19 | 86.4 | /usr/bin/php-cgi | tools.spellcheck
| 1408 | 49-06:42:05 | 97.4 | /usr/bin/php-cgi | tools.pinyin-wiki
| 1408 | 51-01:59:18 | 94.7 | /usr/bin/php-cgi | tools.spellcheck
| 1409 | 15-11:50:01 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1409 | 15-11:51:30 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1409 | 15-11:51:41 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1409 | 15-11:51:59 | 48.8 | /usr/bin/php-cgi | tools.jembot
| 1410 | 157-11:49:45 | 99.1 | /usr/bin/php-cgi | tools.supercount
| 1411 | 10-23:45:03 | 93.9 | /usr/bin/php-cgi | tools.dupdet
| 1412 | 89-14:12:51 | 74.7 | /usr/bin/php-cgi | tools.sowhy
| 1412 | 89-15:08:37 | 74.7 | /usr/bin/php-cgi | tools.sowhy
| 1413 | 71-18:12:53 | 97.5 | /usr/bin/php-cgi | tools.croptool
| 1413 | 73-23:54:30 | 98.3 | /usr/bin/php-cgi | tools.wsexport
| 1415 | 82-04:02:55 | 94.5 | /usr/bin/php-cgi | tools.dupdet
| 1415 | 109-06:58:47 | 99.1 | /usr/bin/php-cgi | tools.wsexport
| 1416 | 13-13:51:52 | 76.8 | /usr/bin/php-cgi | tools.wsexport
| 1416 | 83-19:04:48 | 95.1 | /usr/bin/php-cgi | tools.jembot
| 1417 | 77-04:03:09 | 95.4 | /usr/bin/php-cgi | tools.wsexport
| 1428 | 7-14:07:06 | 75.8 | /usr/bin/php-cgi | tools.iabot
| 1428 | 106-02:49:12 | 97.3 | /usr/bin/php-cgi | tools.wsexport

This was probably not the fault of the tool. We have seen this behavior from webservice occasionally in the past. For currently undiagnosed reasons, the internal watchdog process that checks whether a webservice job is running for a given tool can lose track of the running job(s) and begin spawning a new job on each pass through the status reconciliation loop. The "fix" has historically been to (see the rough shell sketch after this list):

  • Stop the webservice using webservice stop
  • Delete the tool's $HOME/service.manifest file
  • Wait a small amount of time for the watchdog process to notice this and stop trying to manage the job
  • Use qdel to stop all of the running jobs for the tool
  • Look across the job grid for "orphaned" processes owned by the tool and kill them
  • Restart the tool's webservice
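A rough shell sketch of those steps, assuming a hypothetical tool named exampletool and the standard Toolforge paths; the exact job id and wait time will vary:

$ become exampletool            # switch to the (hypothetical) tool account
$ webservice stop               # 1. stop the webservice
$ rm $HOME/service.manifest     # 2. delete the manifest the watchdog reads
$ sleep 600                     # 3. give the watchdog time to notice and back off
$ qstat                         # 4. list jobs still registered for the tool...
$ qdel <jobid>                  #    ...and qdel each one that remains
# 5. (admins) look across the exec/webgrid nodes for orphaned processes owned
#    by the tool and kill them; see the clush one-liners in the description
$ webservice start              # 6. restart the tool's webservice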

I'm sorry for the inconvenience and for the late answer; I was delayed in reading the mail. I don't know if my code is to blame, as I haven't made any recent changes, but I'll pay more attention in the future. Also, if needed, you can reach me much faster as jem on IRC; I have a permanent connection. Thanks for the intervention.

@-jem- it looks like an issue with the webservice system, though your bot definitely exacerbates it.

I went around during my evening hours last night and killed a bunch of processes on the grid that were owned by tools but had pid 1 as their parent process. This seems to be a reliable way to detect processes that have leaked out of Grid Engine's control. The orphan processes I killed belonged to jembot, iabot, dupdet, and wsexport. I also found that xlinkbot, linkwatcher, and coibot seem to create pid-1-parented processes by design somehow.
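On a single node, the same check can be done directly with ps and awk; a minimal sketch (the tools. prefix is just a convenient way to match tool accounts):

$ ps axwo user:20,ppid,pid,cmd | awk '$2 == 1 && $1 ~ /^tools\./'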


Just now I ran the command I dreamed up to search for these and found four new orphaned /usr/bin/php-cgi processes owned by jembot. The information I got from qstat -xml as tools.jembot showed that the grid job had been started recently, and the logged job was running on a different grid node. Looking at https://tools.wmflabs.org/grid-jobs/tool/jembot for the history, I see that the lighttpd-jembot process has been submitted to the grid 1010 times in the past 7 days. That is a lot of crashing for a PHP webservice.

I went poking around in the tool's $HOME to see if there was a sign of why. The $HOME/service.log did not show a pattern of many restarts. I did find a $HOME/.bigbrotherrc~ file showing that at some point bigbrother had been set up to run webservice. This won't work on a number of levels, but I think bigbrother would actually complain about that not being a watchable job. I checked /data/project/.system/bigbrother.scoreboard and do not see jembot listed as a watched job.

I then stopped the webservice, removed the service.manifest and service.log files, and started the webservice back up. I'll try to remember to check in again in a day or so and see if the pattern of restarts and orphan processes reappears or not.
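For reference, the individual checks described above reduce to a few commands run as the tool; a sketch (the grep patterns assume the log formats quoted elsewhere in this task):

$ grep -c 'attempting to start' $HOME/service.log          # watchdog restart attempts logged
$ grep jembot /data/project/.system/bigbrother.scoreboard  # is bigbrother watching the tool?
$ qstat -xml | grep -c 'lighttpd-jembot'                   # webservice jobs the grid currently knows about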

Maybe we should take a look at https://arc.liv.ac.uk/SGE/howto/remove_orphaned_processes.html and see if any of the techniques for automatically killing orphan processes described there will work on our grid.

I'll try to remember to check in again in a day or so and see if the pattern of restarts and orphan processes reappears or not.

There are another 4 orphan /usr/bin/php-cgi processes started by tools.jembot on tools-webgrid-lighttpd-1426.tools.eqiad.wmflabs. The webservice is currently running on tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs.

tools.jembot@tools-bastion-02:~$ tail service.log
2018-02-10T09:00:08.001409 No running webservice job found, attempting to start it
2018-02-10T11:20:07.390117 No running webservice job found, attempting to start it
2018-02-10T12:10:07.528442 No running webservice job found, attempting to start it
2018-02-10T12:30:07.276491 No running webservice job found, attempting to start it
2018-02-10T12:40:08.987820 No running webservice job found, attempting to start it
2018-02-10T17:40:06.110219 No running webservice job found, attempting to start it
2018-02-10T17:40:23.206703 Timed out attempting to start webservice (15s)
2018-02-10T19:40:10.160176 No running webservice job found, attempting to start it
2018-02-11T01:00:07.525414 No running webservice job found, attempting to start it
2018-02-11T03:40:08.663015 No running webservice job found, attempting to start it

It looks like jembot's webservice continues to die on a regular basis, and sometimes when it dies grid engine / Linux does not properly clean up the PHP runners that were started by lighttpd.

The error.log file contains 74,943 lines like:

2018-02-11 01:40:06: (server.c.1558) server stopped by UID = 0 PID = 22028

These go back to 2016-10-17.
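One way to reproduce those numbers from the tool's $HOME (a sketch, assuming the lighttpd "server stopped" log format shown above):

$ grep -c 'server stopped by UID = 0' $HOME/error.log   # count of "server stopped" lines
$ head -n 1 $HOME/error.log                             # oldest entry, i.e. how far back the log goes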

It seems the faulty webgrid jobs have piled up. If someone could kill the stuck /usr/bin/php-cgi processes owned by tools.jembot, that would be nice :]

It seems the faulty webgrid jobs have piled up. If someone could kill the stuck /usr/bin/php-cgi processes owned by tools.jembot, that would be nice :]

Confirmed, I see tons of leakage via clush -w @all 'sudo pidstat -U tools.jembot' | grep jem

Culled things with:

clush -w @all 'sudo /usr/bin/pkill --signal 9 -u tools.jembot'

Awesome. Looks like most of them happened to be scheduled on labvirt1013, which now shows a nice drop in CPU/load.

In T192736, @Nemo_bis wrote:

6 instances and counting are listed on https://grafana-labs.wikimedia.org/dashboard/db/top-instances?orgId=1 as 100 % user CPU, seemingly for tools.jem processes (maybe jembot):

top - 16:34:23 up 95 days, 19:50,  1 user,  load average: 4.01, 4.11, 4.12
%Cpu(s): 36.2 us,  0.4 sy,  0.0 ni, 63.3 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14532 tools.j+  20   0  373064  19424   8820 R 100.0  0.2  34036:27 php-cgi
14533 tools.j+  20   0  371348  12708   3940 R  98.8  0.2  34034:45 php-cgi
14536 tools.j+  20   0  371508  18384   9148 R  98.8  0.2  34034:48 php-cgi
14535 tools.j+  20   0  372328  14824   6092 R  92.7  0.2  34035:35 php-cgi

Are those tasks doing something useful or are they just stuck?

Instance is the tools-webgrid-lighttpd-* instance number:

| Instance | Cumulative CPU time (days-hh:mm:ss) | Utilization (%) | Command | User
| 1417 | 3-23:30:00 | 95.8 | /usr/bin/php-cgi | tools.jembot
| 1417 | 3-23:30:29 | 95.8 | /usr/bin/php-cgi | tools.jembot
| 1417 | 3-23:30:31 | 95.8 | /usr/bin/php-cgi | tools.jembot
| 1417 | 3-23:32:53 | 95.9 | /usr/bin/php-cgi | tools.jembot
| 1420 | 10-14:23:02 | 95.0 | /usr/bin/php-cgi | tools.jembot
| 1420 | 10-14:23:10 | 95.0 | /usr/bin/php-cgi | tools.jembot
| 1420 | 10-14:23:43 | 95.0 | /usr/bin/php-cgi | tools.jembot
| 1420 | 10-14:25:20 | 95.0 | /usr/bin/php-cgi | tools.jembot
| 1425 | 5-22:25:51 | 96.4 | /usr/bin/php-cgi | tools.jembot
| 1425 | 5-22:26:53 | 96.4 | /usr/bin/php-cgi | tools.jembot
| 1425 | 5-22:27:46 | 96.5 | /usr/bin/php-cgi | tools.jembot
| 1425 | 5-22:28:26 | 96.5 | /usr/bin/php-cgi | tools.jembot

I also noticed a few stalled processes owned by tools.dupdet:

| Instance | Cumulative CPU time (days-hh:mm:ss) | Utilization (%) | Command | User
| 1411 | 5-20:49:29 | 42.7 | /usr/bin/php-cgi | tools.dupdet
| 1411 | 7-16:31:17 | 55.9 | /usr/bin/php-cgi | tools.dupdet
| 1411 | 10-17:41:05 | 78.1 | /usr/bin/php-cgi | tools.dupdet
| 1411 | 12-05:23:05 | 89.0 | /usr/bin/php-cgi | tools.dupdet
| 1422 | 5-18:13:19 | 99.3 | /usr/bin/php-cgi | tools.dupdet

Mentioned in SAL (#wikimedia-cloud) [2018-06-20T15:09:06Z] <bd808> Killed orphan processes on webgrid nodes (T182070); most owned by jembot and croptool

Here are the clush commands I have been using to first check for and then kill processes that have leaked out of grid engine due to some cleanup failure:

find-orphans
$ clush -w @exec -w @webgrid -b 'ps axwo user:20,ppid,pid,cmd | grep -Ev "^($USER|root|daemon|diamond|_lldpd|messagebus|nagios|nslcd|ntp|prometheus|statd|syslog|Debian-exim|www-data|sgeadmin)"|grep -v perl|grep -E "     1 "'
kill-orphans
$ clush -w @exec -w @webgrid -b 'ps axwo user:20,ppid,pid,cmd | grep -Ev "^($USER|root|daemon|diamond|_lldpd|messagebus|nagios|nslcd|ntp|prometheus|statd|syslog|Debian-exim|www-data|sgeadmin)"|grep -v perl|grep -E "     1 "|awk "{print \$3}"|xargs sudo kill -9'
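Broken out with comments, the find-orphans pipeline does the following on each node; this is just an expanded restatement of the one-liner above, not a new check:

#!/bin/bash
# Expanded restatement of the find-orphans one-liner, for one node:
#  - ps axwo user:20,ppid,pid,cmd : every process with owner, PPID, PID, command
#  - first grep                   : drop rows owned by system accounts (the whitelist)
#  - grep -v perl                 : skip the perl daemons that parent to pid 1 by design
#  - last grep                    : keep rows whose PPID column is exactly 1 (orphans)
# The kill variant appends: awk "{print \$3}" | xargs sudo kill -9
ps axwo user:20,ppid,pid,cmd \
  | grep -Ev "^($USER|root|daemon|diamond|_lldpd|messagebus|nagios|nslcd|ntp|prometheus|statd|syslog|Debian-exim|www-data|sgeadmin)" \
  | grep -v perl \
  | grep -E "     1 "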

As the author of Croptool, I see that the tool hangs from time to time and I don't really understand why. If a user requests a very large image to be cropped, it will naturally cause high CPU usage for some time, but I would expect the PHP process to eventually be killed or time out. Let me know if there are settings I should try changing!

Earlier there was the T104799 problem, but I'm not sure whether that's still an issue.

(Btw. I looked into using Kubernetes instead, but don't think I can use custom images yet: https://github.com/danmichaelo/croptool/issues/106)

As the author of Croptool, I see that the tool hangs from time to time and I don't really understand why. If a user requests a very large image to be cropped, it will naturally cause high CPU usage for some time, but I would expect the PHP process to eventually be killed or time out. Let me know if there are settings I should try changing!

What I think I am seeing on the job grid from a few tools is the PHP FastCGI server (/usr/bin/php-cgi) launched by lighttpd leaking child processes. The commands shown in T182070#4303107 look for running processes with a parent process id of 1. This is the hallmark of an orphan process, where the parent process has ended before the child. My hunch is that this is more likely to happen to tools whose PHP scripts are doing large or slow IO operations. PHP is not well known for good interrupt handling, which makes me suspect that the orphan process is in a polling loop of some kind, waiting to send or receive data, and did not respond to the SIGHUP sent by the parent process as it closed.
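The reparenting behaviour the ppid filter keys on is easy to demonstrate on any Linux host with no process subreaper configured (a generic illustration, not specific to the grid):

$ bash -c 'sleep 600 &'              # the parent shell exits immediately, the child keeps running
$ ps -o user,ppid,pid,cmd -C sleep   # the orphaned sleep now reports PPID 1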

When I have looked into this more closely in the past, I typically found that the webservice for the tool that had left orphans was not actively running on that same exec node in the job grid. This in turn makes me suspect that grid engine itself is part of the problem. A likely scenario goes something like this (a qstat check for the memory-limit step is sketched after the list):

  • grid spawns a lighttpd process as the webservice for tool X
  • tool X does some work and eventually exceeds the memory limit that the grid granted it
  • the grid notices the memory limit violation and terminates the lighttpd process
  • one or more PHP fcgi processes are busy and do not respond to the signal sent by lighttpd when the grid kills it
  • processes leak and the grid engine scheduler is unaware of them in its accounting to determine if more work can be sent to that grid exec node
  • the grid node gets overloaded and everything there suffers as a result

There are some things we might try to tune in the grid engine deployment itself to make it more aggressive about terminating child processes when shutting down a job. However, I have been hoping that we can hold off on digging too far into this until we have deployed a newer version of grid engine. In the next 9 months we will be updating to "Son of Grid Engine" as packaged by the Debian upstream, and this grid engine derivative has some improvements in how it tracks and manages job processes that may make this problem either go away or at least become less likely to occur.

(Btw. I looked into using Kubernetes instead, but don't think I can use custom images yet: https://github.com/danmichaelo/croptool/issues/106)

You are correct. We still do not have a mechanism to add packages to one of our Kubernetes images for a specific tool (for example, an apt-get install ... when first started), nor do we have a way for tools to upload their own custom Docker containers.

Mentioned in SAL (#wikimedia-cloud) [2018-06-29T16:46:08Z] <bd808> Killed orphan tool owned processes running on the job grid. Mostly jembot and wsexport php-cgi processes stuck in deadlock following an OOM. T182070