Page MenuHomePhabricator

send wdqs logs to logstash
Closed, ResolvedPublic

Description

wdqs log messages woudl be much easier to read / analyze if they were sent to logstash.

wdqs logging is based on logback, I'm not sure we are using logback for other application, so I'll do some digging to see if there is a nice appender / encoder for it.

Event Timeline

Restricted Application added subscribers: PokestarFan, Aklapper. · View Herald Transcript
Smalyshev triaged this task as Medium priority.Aug 7 2017, 6:59 PM

Change 371939 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs - send logs to logstash

https://gerrit.wikimedia.org/r/371939

Change 371950 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/rdf@master] [WIP] Send logs to logstash

https://gerrit.wikimedia.org/r/371950

Change 371950 merged by jenkins-bot:
[wikidata/query/rdf@master] Send logs to logstash

https://gerrit.wikimedia.org/r/371950

Change 371939 merged by Gehel:
[operations/puppet@production] wdqs - send logs to logstash

https://gerrit.wikimedia.org/r/371939

Change 374362 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs - use logstach UDP instead of logstash TCP as a logback appender

https://gerrit.wikimedia.org/r/374362

Change 374362 merged by Gehel:
[operations/puppet@production] wdqs - use logstach UDP instead of logstash TCP as a logback appender

https://gerrit.wikimedia.org/r/374362

Logs are now sent to logstash, but the "host" field isn't set correctly (its value is always "%{HOSTNAME}". Some analysis:

  • logs are sent over UDP, using the net.logstash.logback.appender.LogstashSocketAppender, to logstash port 11514
  • logstash is configured with an input type=logstash, codec=json on that port
  • logstash is configured to replace the hostname for [type] == logback
mutate {
      replace => [ "host", "%{HOSTNAME}" ]
      add_tag => [ "logback", "es" ]
    }
  • the same configuration seem to work for Cassandra

Change 374512 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/rdf@master] Use the standard MDCInsertingServletFilter instead of our custom LoggingFilter

https://gerrit.wikimedia.org/r/374512

Change 374513 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs - change logging pattern to conform to the logback MDCInsertingServletFilter

https://gerrit.wikimedia.org/r/374513

Logs are now sent to logstash, but the "host" field isn't set correctly (its value is always "%{HOSTNAME}". Some analysis:

  • logs are sent over UDP, using the net.logstash.logback.appender.LogstashSocketAppender, to logstash port 11514
  • logstash is configured with an input type=logstash, codec=json on that port
  • logstash is configured to replace the hostname for [type] == logback
mutate {
      replace => [ "host", "%{HOSTNAME}" ]
      add_tag => [ "logback", "es" ]
    }
  • the same configuration seem to work for Cassandra

I'm also a bit weak on my logstash, but if i understand correctly %{...} basically resolves to another field in the same document. That would imply that wdqs is not sending the HOSTNAME variable. If we look at the actual documents we can see that cassandra docs have the HOSTNAME variable, and wdqs do not. So the question is, why isn't wdqs sending this?

It looks like HOSTNAME was added in https://github.com/logstash/logstash-logback-encoder/commit/0c67cff00bc0e9e03f566f53bfe55e11a60fea27

The earliest tag that includes this commit is logstash-logback-encoder-2.1. It seems unlikely we are using a version older than that. Perhaps somehow wdqs isn't using the same defaults? It seems like hostname is addded by default with a bunch of other properties in current master, via LogstashAccessFormatter.java

Oh, I was expecting %{HOSTNAME} to be interpreted by logstash itself, not as a ref in the same document. There is something about HOSTNAME being lazy loaded in recent version of logback. I'll try to find the reference...

Change 374512 merged by jenkins-bot:
[wikidata/query/rdf@master] Use the standard MDCInsertingServletFilter instead of our custom LoggingFilter

https://gerrit.wikimedia.org/r/374512

Change 374806 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs - tuning of logback configuration to send logs to logstash

https://gerrit.wikimedia.org/r/374806

Change 374806 merged by Gehel:
[operations/puppet@production] wdqs - tuning of logback configuration to send logs to logstash

https://gerrit.wikimedia.org/r/374806

Mentioned in SAL (#wikimedia-operations) [2017-08-30T13:16:47Z] <gehel> restarting wdqs-blazegraph and wdqs-updater on all wdqs nodes for logback config change - T172710

Logs are now correctly sent to logstash. @Smalyshev: can you review and let me know if we are missing anything? Else just close this task...

Change 374513 merged by Gehel:
[operations/puppet@production] wdqs - logging pattern to conform to the logback MDCInsertingServletFilter

https://gerrit.wikimedia.org/r/374513

Smalyshev claimed this task.