
Diamond logstash monitor fills /var/log/apache2 access log
Closed, ResolvedPublic


deployment-logstash1.eqiad.wmflabs ended up with a full /var partition.

/var/log/apache2/other_vhosts_access.log was filled with lines such as:

- - [17/Oct/2014:08:16:52 +0000] "GET /server-status HTTP/1.1" 301 592 "-" "Python-urllib/2.7"
- - [17/Oct/2014:08:16:52 +0000] "GET /server-status HTTP/1.1" 301 592 "-" "Python-urllib/2.7"
- - [17/Oct/2014:08:16:52 +0000] "\x16\x03\x01" 301 308 "-" "-"
- - [17/Oct/2014:08:16:52 +0000] "\x16\x03\x01" 301 308 "-" "-"

/var/log/diamond/diamond.log had a lot of:

[2014-10-14 19:33:37,250] [Thread-1] Error retrieving HTTPD stats for host, url '/server-status?auto': [Errno 99] Cannot assign requested address

I guess something is (was?) wrong in the Diamond collector used to monitor logstash.
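For reference, "Errno 99" on Linux is EADDRNOTAVAIL, which typically shows up when a client tries to bind or connect using a local address that is not actually configured on the host. The errno-to-message mapping can be checked directly (values shown are for Linux/glibc):

```python
# Decode "Errno 99" from the diamond.log message: on Linux this is
# EADDRNOTAVAIL, whose message is "Cannot assign requested address".
import errno
import os

print(errno.errorcode[99])               # EADDRNOTAVAIL (on Linux)
print(os.strerror(errno.EADDRNOTAVAIL))  # Cannot assign requested address
```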

I have deleted the access.log file (freeing up 850MB) and restarted diamond.

As of Nov 25th, the diamond.log is just fine. There are still a lot of requests made to /server-status, though.
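The "GET /server-status" entries look like a mod_status poll. A minimal, self-contained sketch of that kind of poll (the dummy in-process server and the parsed keys are illustrative, not the actual collector code):

```python
# Sketch of an HTTPD-stats poll against mod_status's machine-readable
# /server-status?auto page, using a local stand-in server so the example
# runs anywhere.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class FakeStatusHandler(BaseHTTPRequestHandler):
    """Stands in for Apache mod_status answering /server-status?auto."""

    def do_GET(self):
        if self.path == "/server-status?auto":
            body = b"Total Accesses: 42\nBusyWorkers: 1\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep the example quiet
        pass


def poll_server_status(port):
    # Fetch the status page and parse its "Key: value" lines into a dict,
    # which is roughly what an HTTPD collector does with this output.
    with urlopen("http://127.0.0.1:%d/server-status?auto" % port, timeout=5) as resp:
        lines = resp.read().decode().splitlines()
    return dict(line.split(": ", 1) for line in lines if ": " in line)


server = HTTPServer(("127.0.0.1", 0), FakeStatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
stats = poll_server_status(server.server_address[1])
server.shutdown()
print(stats)  # {'Total Accesses': '42', 'BusyWorkers': '1'}
```

A 301 response (as seen in the access log) means the poller never reaches this machine-readable output at all; it just generates redirect noise.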

Version: unspecified
Severity: normal



Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 3:45 AM
bzimport set Reference to bz72175.
bzimport added a subscriber: Unknown Object (MLST).

+ Yuvi Panda: it seems the diamond collector for logstash spams the Apache access log with garbage requests such as "\x16\x03\x01" :D

That might be due to monitor_fatals using an HTTPS connection:


And the sequence "\x16\x03\x01" is the start of a TLS handshake record, i.e. an attempt to establish an SSL connection on port 80, which does not have mod_ssl enabled.
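Those bytes can be reproduced locally: an HTTPS client talking to a plain TCP port sends a TLS handshake record that begins with exactly \x16 (handshake) \x03\x01 (legacy record version 3.1), which is what Apache then logs as a garbage request line. A self-contained sketch, with a throwaway loopback listener standing in for Apache on port 80:

```python
# Capture the first bytes an HTTPS client sends to a plain (non-TLS) port.
import socket
import ssl
import threading


def capture_first_bytes():
    # Plain TCP listener standing in for Apache without mod_ssl.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    captured = {}

    def serve():
        conn, _ = srv.accept()
        data = b""
        while len(data) < 3:  # read the 3-byte TLS record header
            chunk = conn.recv(3 - len(data))
            if not chunk:
                break
            data += chunk
        captured["head"] = data
        conn.close()

    t = threading.Thread(target=serve)
    t.start()

    # HTTPS client attempting a TLS handshake against the plain port.
    raw = socket.create_connection(("127.0.0.1", port))
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        ctx.wrap_socket(raw, server_hostname="localhost")
    except OSError:
        pass  # handshake fails: the server never answers with TLS
    finally:
        raw.close()
    t.join()
    srv.close()
    return captured["head"]


print(capture_first_bytes())  # b'\x16\x03\x01'
```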

greg triaged this task as Medium priority.Nov 24 2014, 11:26 PM
hashar set Security to None.
yuvipanda claimed this task.

No such instances are found atm, and monitor_fatals.rb is dead.

Reopening, that is still happening. The monitoring is using the vhost, so the spam ends up in /var/log/apache2/other_vhosts_access.log.

Example:

- - [12/Mar/2015:09:14:24 +0000] "GET /server-status HTTP/1.1" 301 592 "-" "Python-urllib/2.7"
- - [12/Mar/2015:09:14:24 +0000] "GET /server-status HTTP/1.1" 301 592 "-" "Python-urllib/2.7"
- - [12/Mar/2015:09:14:24 +0000] "\x16\x03\x01" 301 308 "-" "-"
- - [12/Mar/2015:09:14:24 +0000] "\x16\x03\x01" 301 308 "-" "-"

Some Python script is hitting it improperly. Maybe the ganglia monitor, though its configuration does not refer to HTTPS.

root@deployment-logstash2:/# grep "server-status" etc/* -r
etc/apache2/conf-available/50-server-status.conf:# Only serve /server-status on loopback interface to local requests.
etc/apache2/conf-available/50-server-status.conf:# The default mod_status configuration enables /server-status on all
etc/apache2/conf-available/50-server-status.conf:# a more conservative configuration that makes /server-status accessible
etc/apache2/conf-available/50-server-status.conf:      <Location /server-status>
etc/apache2/conf-available/50-server-status.conf:        SetHandler server-status
etc/ganglia/conf.d/apache_status.pyconf:        value = ""

I tried changing that last file to add ?test to the end of the URL (and then did service ganglia-monitor restart), and now that shows up in the log. So it's definitely the ganglia monitor.

(and then ran puppet to put it back how it was)
But what do we want to do: disable the ganglia monitor, rotate the access logs more often, or...?
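For the "rotate the access logs more often" option, a sketch of what a logrotate stanza for this file could look like (path, cadence, and retention here are assumptions, not the existing Puppet-managed config):

```
/var/log/apache2/other_vhosts_access.log {
    daily
    rotate 4
    compress
    missingok
    notifempty
    postrotate
        /usr/sbin/apache2ctl graceful > /dev/null 2>&1 || true
    endscript
}
```

This would cap the disk usage from the monitoring noise without touching the monitor itself.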

hashar claimed this task.

That no longer appears. The main reason was /var being too small, which is no longer the case today.