This happened for croptool on 5 Jul 2015:
Last lines in error.log:
[2015-07-05 03:34:03] exiftool.INFO: Exiftool executes command /data/project/croptool/vendor/phpexiftool/exiftool/exiftool -overwrite_original -quiet -TagsFromFile '/data/project/croptool/public_html//files/c1ba468a60b8a2fdbe02cdd36721d9fedf2f8835.jpg' -all:all '/data/project/croptool/public_html//files/c1ba468a60b8a2fdbe02cdd36721d9fedf2f8835_cropped.jpg' [] []
2015-07-05 11:48:57: (server.c.1398) [note] sockets disabled, connection limit reached
Last lines in access.log:
10.68.17.145 tools.wmflabs.org - [05/Jul/2015:07:10:04 +0000] "GET /croptool/backend.php?action=exists&site=commons.wikimedia.org&title=Florazolam%20(cropped).jpg HTTP/1.1" 200 62 "https://tools.wmflabs.org/croptool/?title=Florazolam.jpg" "Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.7.0"
10.68.17.145 tools.wmflabs.org - [05/Jul/2015:07:12:38 +0000] "HEAD /croptool/ HTTP/1.1" 200 0 "-" "Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)"
10.68.17.145 tools.wmflabs.org - [05/Jul/2015:07:16:45 +0000] "GET /croptool/ HTTP/1.1" 200 21933 "-" "Domain Re-Animator Bot (http://domainreanimator.com) - support@domainreanimator.com"
10.68.17.145 tools.wmflabs.org - [05/Jul/2015:07:17:38 +0000] "HEAD /croptool/ HTTP/1.1" 200 0 "-" "Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)"
10.68.17.145 tools.wmflabs.org - [05/Jul/2015:07:22:38 +0000] "HEAD /croptool/ HTTP/1.1" 200 0 "-" "Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)"
Note that the 'connection limit reached' message was logged much later than the last request in access.log (11:48:57 versus 07:22:38).
There's a whole set of connections to the proxy open:
tcp 1 0 tools-webgrid-lighttpd-1210.tools.eqiad.wmflabs:56235 tools-webproxy-02.tools.eqiad.wmflabs:41828 CLOSE_WAIT tools.croptool 17100020
tcp 1 0 tools-webgrid-lighttpd-1210.tools.eqiad.wmflabs:56235 tools-webproxy-02.tools.eqiad.wmflabs:56003 CLOSE_WAIT tools.croptool 17135240
tcp 1 0 tools-webgrid-lighttpd-1210.tools.eqiad.wmflabs:56235 tools-webproxy-02.tools.eqiad.wmflabs:33872 CLOSE_WAIT tools.croptool 17084311
tcp 1 0 tools-webgrid-lighttpd-1210.tools.eqiad.wmflabs:56235 tools-webproxy-02.tools.eqiad.wmflabs:54753 CLOSE_WAIT tools.croptool 17150293
tcp 1 0 tools-webgrid-lighttpd-1210.tools.eqiad.wmflabs:56235 tools-webproxy-02.tools.eqiad.wmflabs:44893 CLOSE_WAIT tools.croptool 17129109
This suggests that either a client is keeping the connections open, or the webproxy is misbehaving and holding them open itself.
$ sudo netstat -e -v -W 2>/dev/null | grep croptool | grep proxy | cut -b146- > croptools-inodes
$ for i in `cat croptools-inodes`; do sudo find /proc/6703/fd -lname "socket:\[$i\]" -printf %A@; echo; done > croptools-timestamps
$ for i in `cat croptools-timestamps`; do date "+%d/%b/%Y:%H:%M:%S" --date="@$i"; done > croptools-formatteddates
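The `cut -b146-` step depends on the exact column widths of the `netstat` output. A minimal sketch of an alternative, assuming the inode is always the last whitespace-separated field (as it is in the `netstat -e` lines quoted earlier); `extract_inodes` is a hypothetical helper name, not something from the original session:

```shell
# Hypothetical helper: print the socket inode (last field) for every
# netstat line on stdin that mentions both croptool and the proxy.
# Assumes `netstat -e` output without -p, so the inode is the final column.
extract_inodes() {
    awk '/croptool/ && /proxy/ { print $NF }'
}

# Usage (same pipeline as above, minus the fixed byte offset):
# sudo netstat -e -v -W 2>/dev/null | extract_inodes > croptools-inodes
```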
Strangely enough, most of the connections seem to be later than the last access.log entry, /and/ they seem to be batched:
$ cat croptools-formatteddates | sort | uniq -c
      1 04/Jul/2015:23:09:01
     15 05/Jul/2015:07:39:01
     31 05/Jul/2015:08:09:01
     66 05/Jul/2015:09:09:01
     54 05/Jul/2015:10:09:01
     38 05/Jul/2015:10:39:01
     37 05/Jul/2015:11:09:01
     35 05/Jul/2015:11:39:01
     22 05/Jul/2015:11:58:34
with a whole batch of connections every half-hour.
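To make the half-hourly pattern easier to see, the timestamps can be grouped by their minute:second offset within the hour. This is a rough sketch, assuming the `croptools-formatteddates` file produced earlier with one `dd/Mon/yyyy:HH:MM:SS` timestamp per line; `batch_offsets` is a hypothetical helper name:

```shell
# Hypothetical helper: count timestamps per MM:SS offset within the hour.
# $1 is a file with one dd/Mon/yyyy:HH:MM:SS timestamp per line.
batch_offsets() {
    # Splitting on ':' gives: date, hour, minute, second.
    awk -F: 'NF == 4 { count[$3 ":" $4]++ }
             END { for (k in count) print count[k], k }' "$1" | sort -k2
}

# Usage:
# batch_offsets croptools-formatteddates
```

With the counts listed earlier, every batch except the final one (11:58:34) would land under either `09:01` or `39:01`.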
Tools-webproxy-02 doesn't seem to see these connections, though, so I'm a bit at a loss as to what's happening here.
Restarting the webservice solved the issue.
Using
#!/bin/bash
for i in `qconf -sel | grep webgrid`
do
    echo $i
    echo --------------------
    ssh $i "sudo netstat -e -v -W 2>/dev/null | grep CLOSE_WAIT | sed -e 's/.*CLOSE_WAIT\s*//' | cut -d' ' -f1 | sort | uniq -c"
    echo --------------------
    echo
done
we see that a few servers have a few (<=5) connections open, which is probably just actual traffic. However, two tools have large numbers of open connections:
tools-webgrid-lighttpd-1402.eqiad.wmflabs
--------------------
    103 tools.blockcalc
--------------------

(...)

tools-webgrid-lighttpd-1408.eqiad.wmflabs
--------------------
    150 tools.geohack
--------------------
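To separate the outliers from ordinary traffic automatically, the per-tool counts could be filtered against a threshold. A minimal sketch, assuming `netstat -e` output on stdin with the state in column 6 and the user in column 7 (as in the lines quoted earlier); `close_wait_over` is a hypothetical helper, and the default limit of 5 simply mirrors the "<=5" observation above:

```shell
# Hypothetical helper: list users with more than N sockets in CLOSE_WAIT.
# Reads `netstat -e` output on stdin; $1 is the limit (default 5).
close_wait_over() {
    awk -v limit="${1:-5}" '
        $6 == "CLOSE_WAIT" { n[$7]++ }
        END { for (u in n) if (n[u] > limit) print n[u], u }
    ' | sort -rn
}

# Usage, per host:
# ssh $i "sudo netstat -e -v -W 2>/dev/null" | close_wait_over 5
```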