"webservice stop" leaves blocking php-cgi processes behind
Closed, Resolved, Public

Description

The webservice for my "commonshelper" tool is running, but I can't load the web page(s). Examples:

http://tools.wmflabs.org/commonshelper/index.php (tool)
http://tools.wmflabs.org/commonshelper/index_test.php (simple test page)

The pages just keep loading "forever".

I have received similar bug reports for multiple tools since yesterday. Those were resolved by restarting the web service, but apparently this one is not.


Version: unspecified
Severity: blocker
URL: http://tools.wmflabs.org/commonshelper/

bzimport added a project: Tool-Labs. Via Conduit, Nov 22 2014, 3:16 AM
bzimport set Reference to bz64095.
Magnus created this task. Via Legacy, Apr 18 2014, 4:57 PM
bzimport added a comment. Via Conduit, Apr 22 2014, 2:00 PM

metatron wrote:

I've seen this problem before. The lighttpd webservice stops, but old php-cgi processes remain. "webservice start" then starts /one/ lighttpd process, but it can't start new php-cgi processes. So plain HTML or py is served just fine, while PHP requests are "stuck".

This is the output from webgrid for commonshelper:

tools-webgrid-01: (13:51:40)

608 tools.co 20 0 48668 2116 1312 S 0 0.0 0:00.03 lighttpd

11144 tools.co 20 0 281m 11m 7748 S 0 0.1 0:00.03 php-cgi
11146 tools.co 20 0 288m 11m 4764 S 0 0.1 2:04.32 php-cgi
11147 tools.co 20 0 288m 11m 4680 S 0 0.1 0:36.61 php-cgi
11148 tools.co 20 0 288m 11m 4760 S 0 0.1 1:29.41 php-cgi
11149 tools.co 20 0 288m 11m 4756 S 0 0.1 2:42.74 php-cgi

tools-webgrid-02: (13:51:40)
19567 tools.co 20 0 281m 11m 7764 S 0 0.1 0:00.01 php-cgi
19575 tools.co 20 0 283m 9844 4320 S 0 0.1 0:35.07 php-cgi
19576 tools.co 20 0 283m 9836 4312 S 0 0.1 0:01.24 php-cgi
19577 tools.co 20 0 283m 9912 4272 S 0 0.1 0:34.99 php-cgi
19578 tools.co 20 0 283m 9796 4272 S 0 0.1 0:35.88 php-cgi

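I figured out this workaround. Save this as a script and execute it:

#!/bin/bash
# Stop the web service, then kill the php-cgi processes it left behind
# on both web grid hosts before starting it again.
webservice stop
sleep 5
ssh tools-webgrid-01 'pkill -9 -U tools.commonshelper php-cgi'
ssh tools-webgrid-02 'pkill -9 -U tools.commonshelper php-cgi'
sleep 5
webservice start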

scfc added a comment. Via Conduit, Apr 22 2014, 2:38 PM

metatron is correct; I recently had to purge some old processes (cf. [[wikitech:Nova Resource:Tools/SAL#April 10]]).

To fix Magnus' issue, I killed the blocking php-cgi processes; the tool should be working again.

The underlying problem is that "webservice stop" uses qdel, which by default uses SIGKILL. That kills the lighttpd process and its workers, but not the spawned php-cgi processes.

Testing shows that on SIGTERM lighttpd correctly ends its workers and the spawned php-cgi processes.

I recently filed bug #61102 to use SIGTERM for the general case of jsub; the same logic applies to this bug as well.
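The difference between the two signals can be reproduced without lighttpd. The following is only a minimal sketch under stated assumptions: a small bash process stands in for lighttpd, a "sleep" stands in for a spawned php-cgi, and the function name child_survives is made up for this illustration. The stand-in parent cleans up its child in a TERM handler; SIGKILL cannot be caught, so that handler never runs.

```shell
#!/bin/bash
# child_survives SIGNAL
# Starts a stand-in parent that spawns one child, sends SIGNAL to the parent,
# and prints "yes" if the child is still alive afterwards, "no" otherwise.
# (Hypothetical helper for illustration; not part of webservice or qdel.)
child_survives() {
    local sig=$1 pidfile parent child result i=0
    pidfile=$(mktemp)

    # Stand-in for lighttpd: spawn a worker and end it when asked to terminate.
    bash -c '
        sleep 300 &                       # stand-in for a spawned php-cgi
        child=$!
        trap "kill $child 2>/dev/null; wait $child 2>/dev/null; exit 0" TERM
        echo "$$ $child" > "$0"           # report both PIDs via the pid file
        wait "$child"
    ' "$pidfile" >/dev/null 2>&1 &

    # Wait (up to about 5 seconds) until the parent has written its PIDs.
    while [ ! -s "$pidfile" ] && [ "$i" -lt 50 ]; do
        sleep 0.1
        i=$((i + 1))
    done
    read -r parent child < "$pidfile"

    kill -s "$sig" "$parent"
    wait "$parent" 2>/dev/null || true    # reap the parent, ignore its status
    sleep 1

    if kill -0 "$child" 2>/dev/null; then
        result=yes
        kill -9 "$child"                  # clean up the orphaned child
    else
        result=no
    fi
    rm -f "$pidfile"
    echo "$result"
}

echo "SIGKILL: child survives? $(child_survives KILL)"
echo "SIGTERM: child survives? $(child_survives TERM)"
```

On a typical Linux system the SIGKILL case should leave the child running and the SIGTERM case should not, which matches the behaviour described above for lighttpd and php-cgi.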

Magnus added a comment. Via Conduit, Apr 22 2014, 7:09 PM

Thanks Tim, metatron, it works again!

scfc added a comment. Via Conduit, Apr 22 2014, 7:17 PM

It works for now :-), but the general problem hasn't been solved yet.

scfc added a comment. Via Conduit, Apr 22 2014, 7:36 PM

Ha! I knew I had jotted down something about the problem earlier.

*** This bug has been marked as a duplicate of bug 63878 ***
