
Revisit the logging work done on Q1 2017-2018 for the standard pod setup
Closed, ResolvedPublic

Description

In Q1 2017-2018 one bullet point of a quarterly goal was Standardize on a "default" pod setup (T170120). While the bullet (and the entire goal) was successfully completed, the end result for logging (which was to log directly to logstash from the application) had a number of drawbacks, most notably:

  • Standard kubernetes tools like kubectl logs don't work with it
  • It is more involved and complicated to differentiate between development and production environments, as it needs extra configuration
  • Moves a lot of the logic and responsibility of logging into each application, making divergence more probable and adding complexity (and bugs) into the application.
  • While a supported configuration, it is not one used by many in the kubernetes community, meaning it's a tad more difficult to get support for it

Back then fluentd was evaluated, and containers were built and tried. Unfortunately, due to changes and restructuring in the fluentd software, some functionality was not working.

In the meantime, Infrastructure Foundations has adopted the logging infrastructure and is moving forward with a redesign of it in T205849. The recommended solution for log shipping there is rsyslog, which does support kubernetes.

This task is about:

  • Evaluating rsyslog (with the mmkubernetes plugin) vs (newer version of) fluentd
  • Settling on one of the two for shipping logs from applications in our kubernetes environments
  • Implementing the solution

Event Timeline

akosiaris triaged this task as Medium priority. Oct 16 2018, 6:27 PM
akosiaris created this task.

rsyslog

Prereqs

Needs rsyslog 8.37 from stretch-backports, rebuilt with the mmkubernetes module enabled.

Also pulls in libfastjson4 and liblognorm5.
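
A rough sketch of the Debian workflow for that (the exact debian/rules change is an assumption):

# install the newer rsyslog from stretch-backports
apt-get -t stretch-backports install rsyslog

# rebuild with mmkubernetes enabled (assumption: the module is switched on by adding
# --enable-mmkubernetes to the configure flags in debian/rules before building)
apt-get source -t stretch-backports rsyslog
cd rsyslog-8.37.0            # directory name depends on the exact upstream version
# edit debian/rules to add --enable-mmkubernetes, then:
dpkg-buildpackage -us -uc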

Caveats

The log file names need to match https://www.rsyslog.com/doc/v8-stable/configuration/modules/mmkubernetes.html#filenamerules (they do in our case).

If for any reason rsyslog fails to get results from the API about a pod, it will cache the negative result forever. Restart rsyslog to fix it.
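
In practice that means something like the following (the journalctl grep is just one way to spot the failed lookups):

# mmkubernetes has cached a negative lookup; a restart clears the cache
systemctl restart rsyslog
# check for API lookup errors from the module
journalctl -u rsyslog | grep -i mmkubernetes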

Configuration

module(load="imfile")

input(type="imfile" File="/var/log/containers/*.log" tag="kubernetes" reopenOnTruncate="on" addMetadata="on")

module(load="mmkubernetes" KubernetesURL="http://localhost:8080")
action(type="mmkubernetes")

Then with a simple

*.* -/var/log/all.log;RSYSLOG_DebugFormat

we get

Without the action(type="mmkubernetes")

Debug line with all properties:
FROMHOST: '', fromhost-ip: '', HOSTNAME: 'contrib-stretch', PRI: 133,
syslogtag 'kubernetes', programname: 'kubernetes', APP-NAME: 'kubernetes', PROCID: '-', MSGID: '-',
TIMESTAMP: 'Dec 14 14:30:29', STRUCTURED-DATA: '-',
msg: '{"log":"172.17.0.1 - - [14/Dec/2018:14:30:29 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-12-14T14:30:29.716133687Z"}'
escaped msg: '{"log":"172.17.0.1 - - [14/Dec/2018:14:30:29 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-12-14T14:30:29.716133687Z"}'
inputname: imfile rawmsg: '{"log":"172.17.0.1 - - [14/Dec/2018:14:30:29 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-12-14T14:30:29.716133687Z"}'
$!:{ "metadata": { "filename": "\/var\/log\/containers\/echo-7776c5bb4-6wrnf_default_echo-ab65354e76bd241eb496f9ce90e9e67ee1ea255752edab0e6f7c3d50c692477c.log", "fileoffset": "664" } }
$.:
$/:

with the action(type="mmkubernetes")

Debug line with all properties:
FROMHOST: '', fromhost-ip: '', HOSTNAME: 'contrib-stretch', PRI: 133,
syslogtag 'kubernetes', programname: 'kubernetes', APP-NAME: 'kubernetes', PROCID: '-', MSGID: '-',
TIMESTAMP: 'Sep 25 13:07:27', STRUCTURED-DATA: '-',
msg: '{"log":"172.17.0.1 - - [25/Sep/2018:13:07:27 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-09-25T13:07:27.848952843Z"}'
escaped msg: '{"log":"172.17.0.1 - - [25/Sep/2018:13:07:27 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-09-25T13:07:27.848952843Z"}'
inputname: imfile rawmsg: '{"log":"172.17.0.1 - - [25/Sep/2018:13:07:27 +0000] \"GET /dsf HTTP/1.1\" 200 393 \"-\" \"curl/7.52.1\"\n","stream":"stdout","time":"2018-09-25T13:07:27.848952843Z"}'
$!:{ "metadata": { "filename": "\/var\/log\/containers\/echo-7776c5bb4-6wrnf_default_echo-5415aee980e3d777103ae6d67e29790a542023654c959f4548b46ea40dfda213.log", "fileoffset": "2985" }, "kubernetes": { "namespace_id": "c48f8018-bdb8-11e8-af8a-0800278dc04d", "creation_timestamp": "2018-09-21T16:09:50Z", "pod_id": "fa35cb79-c0aa-11e8-826a-0800278dc04d", "labels": { "pod-template-hash": "333271660", "run": "echo" }, "host": "127.0.0.1", "pod_name": "echo-7776c5bb4-6wrnf", "namespace_name": "default", "container_name": "echo", "master_url": "http:\/\/localhost:8080" }, "docker": { "container_id": "5415aee980e3d777103ae6d67e29790a542023654c959f4548b46ea40dfda213" } }
$.:
$/:

Namely, the $! variable, and thus the metadata, gets populated with pod information.

Sending to ELK

This means we need to massage the log to get anything useful out of it.

A simple template() to take advantage of the above would be:

template(name="syslog_json" type="list" option.json="on") {

constant(value="{")
constant(value="\"timestamp\":\"")
property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"message\":\"")
property(name="msg" format="json")
constant(value="\",\"host\":\"")
property(name="hostname")
constant(value="\",\"logsource\":\"")
property(name="hostname")
constant(value="\",\"severity\":\"")
property(name="syslogseverity-text")
constant(value="\",\"facility\":\"")
property(name="syslogfacility-text")
constant(value="\",\"program\":\"")
property(name="programname")
constant(value="\",\"pod_name\":\"")
property(name="$!kubernetes!pod_name")
constant(value="\",\"namespace_name\":\"")
property(name="$!kubernetes!namespace_name")
constant(value="\",\"container_name\":\"")
property(name="$!kubernetes!container_name")
constant(value="\"}")

}

which populates, besides the usual fields, the pod, container and namespace names for each log line.
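
For the sample access log line from the debug output above, the template would emit roughly the following (reconstructed by hand, so treat the values as illustrative; the message field is shortened):

{"timestamp":"2018-09-25T13:07:27+00:00","message":"{\"log\":\"172.17.0.1 - - [25/Sep/2018:13:07:27 +0000] ... 200 393 ...\",\"stream\":\"stdout\", ...}","host":"contrib-stretch","logsource":"contrib-stretch","severity":"notice","facility":"local0","program":"kubernetes","pod_name":"echo-7776c5bb4-6wrnf","namespace_name":"default","container_name":"echo"}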

Given zotero's mediocre logging this does not really make the logs themselves much better, but that's unrelated to mmkubernetes. Shipping them to logstash is relatively easy via the kafka work that Infrastructure Foundations has been doing.

A simple example is

# this is the kafka output config, so load the kafka output module

module(load="omkafka")

# load mmrm1stspace to remove leading space from msg field in output
# (leading space breaks existing grok/regex message filters)
module(load="mmrm1stspace")

# define a template to be used by omkafka dynatopic
template(name="kafka_topic" type="string" string="rsyslog-%syslogseverity-text%")

# send to kafka if lookup table contains "kafka" for relevant programname
# $.log_outputs defined by lookup table in lookup_output.conf
if ( $.log_outputs contains "kafka" ) then {

    # remove leading white space from msg field
    action(type="mmrm1stspace")

    action(type="omkafka"
           broker=["logstash1004.eqiad.wmnet:9093","logstash1005.eqiad.wmnet:9093","logstash1006.eqiad.wmnet:9093"]
           topic="kafka_topic"
           dynatopic="on"
           dynatopic.cachesize="1000"
           template="syslog_json"
           confParam=[ "security.protocol=ssl",
                       "ssl.ca.location=/etc/ssl/certs/Puppet_Internal_CA.pem",
                       "compression.codec=snappy",
                       "socket.timeout.ms=10000",
                       "socket.keepalive.enable=true",
                       "queue.buffering.max.ms=50",
                       "batch.num.messages=1000" ]
    )

}

This is already very generic and reusable; it has been tested and found to work.

Parsing

Parsing unstructured logs happens via mmnormalize and liblognorm. An example is below

action(type="mmnormalize"
  rulebase="/opt/rsyslog/apache.rb"
)

where apache.rb contains rules for parsing apache logs, which can look like this:

version=2

rule=:%clientip:word% %ident:word% %auth:word% [%timestamp:char-to:]%] "%verb:word% %request:word% HTTP/%httpversion:float%" %response:number% %bytes:number% "%referrer:char-to:"%" "%agent:char-to:"%"%blob:rest%
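
Applied to the access log line from the debug output above, the rule would populate roughly the following fields under $! (hand-reconstructed, so the exact types and values are illustrative):

{ "clientip": "172.17.0.1", "ident": "-", "auth": "-",
  "timestamp": "25/Sep/2018:13:07:27 +0000", "verb": "GET", "request": "/dsf",
  "httpversion": "1.1", "response": "200", "bytes": "393",
  "referrer": "-", "agent": "curl/7.52.1", "blob": "" }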

Fluentd

Prereqs

Running fluent-bit on every kubernetes worker. The kubernetes community generally does this with a DaemonSet approach.
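
A rough sketch of such a DaemonSet (the image, namespace and ConfigMap name are assumptions, not something we have deployed):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      # the service account needs RBAC permissions to read pod metadata from the API
      serviceAccountName: fluent-bit
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.0    # assumption: upstream image and tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluent-bit-config       # assumption: holds the configuration snippets below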

Configuration

The configuration looks something like this:

[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Parser           docker
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

is enough to enable kubernetes support.

When Fluent Bit runs, it will read, parse and filter the logs of every POD and will enrich each entry with the following information (metadata):

  • POD Name
  • POD ID
  • Container Name
  • Container ID
  • Labels
  • Annotations

This is exactly the same information we get via mmkubernetes (mmkubernetes actually needs some extra configuration for the annotations, but it does work; see below).
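
For reference, the extra mmkubernetes configuration for annotations is the annotation_match parameter, roughly like this (the catch-all regex is just illustrative):

action(type="mmkubernetes" annotation_match=[".*"])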

This has been tested and works.

Parsing

Writing parsers in fluent-bit is easy:

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L

or

[PARSER]
    Name   apache
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

which would easily allow us to make more sense out of unstructured logs.

Sending to ELK

The example configs suggest it's easy enough:

[OUTPUT]
    Name        kafka
    Match       *
    Brokers     192.168.1.3:9092
    Topics      test

Not tested, however. For our environment it would presumably look closer to the sketch below.
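
A sketch pointing at the same brokers as the omkafka example above, passing librdkafka options through rdkafka.* keys (the topic and the exact keys are assumptions based on fluent-bit's kafka output documentation, and this has not been tested either):

[OUTPUT]
    Name                        kafka
    Match                       kube.*
    Brokers                     logstash1004.eqiad.wmnet:9093,logstash1005.eqiad.wmnet:9093,logstash1006.eqiad.wmnet:9093
    Topics                      rsyslog-notice
    rdkafka.security.protocol   ssl
    rdkafka.ssl.ca.location     /etc/ssl/certs/Puppet_Internal_CA.pem
    rdkafka.compression.codec   snappy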

Conclusions

I am leaning towards not adding one more piece of complexity, since it seems both fulfil the same purpose. That being said, fluent-bit does seem like the more convenient solution; however, that could be offset by the added complexity and the deviation from the standard logging solution.

As discussed over IRC, they are probably equivalent feature-wise. I'm more inclined towards fluentd/fluentbit because it is the "standard" log aggregation solution for kubernetes clusters.

However, there are also good reasons to go with rsyslog, namely standardization.

As long as we keep the logging contract as:

  • Applications will log to stdout, logging daemon will collect and ship logs.
  • Cluster components logging (apiserver, scheduler, kubelets, rbac audit log) should be prioritized over application logging since they should be used to debug issues with the cluster.

I'm happy with any solution :)

Thanks for looking into this! As far as my understanding goes (and please correct me if I'm wrong) the job to do is:

  1. tail json logs matching /var/log/containers/*.log, one json object per line
  2. augment each json entry with metadata obtained by talking https to the k8s api (pod/namespace/container/etc names)
  3. [optional and/or to be done at a later stage?] further parse log attribute into more structured attributes (e.g. apache logs mentioned above)
  4. send the resulting json objects to logstash/elasticsearch for ingestion

Assuming the above is correct, I'd propose to split the job in two parts: namely tail+augment and shipping. With fluentbit handling the former and rsyslog the latter. In other words treat fluentbit's output as any other application logs from rsyslog's POV, with the idea being to keep the two decoupled.

IMHO having rsyslog aware of k8s would bring some challenges (e.g. fixes for mmkubernetes in a new rsyslog upstream version: do we bring in an updated rsyslog for k8s only or upgrade rsyslog fleetwide?) and similarly fluentbit talking to kafka directly would require fluentbit's configuration to somehow know about the kafka broker hostnames (e.g. puppet configures fluentbit's output correctly?)

I'm thus proposing fluentbit to do the tail + augmentation part and rsyslog to ship fluentbit's output. In practical terms it would mean configuring fluentbit to write its output to a file (afaics fluentbit can't output to syslog?) and having rsyslog pick that file up and ingest the json records into kafka/logstash/elasticsearch as usual. What do you think?

@fselles : to your point re: cluster components logging, afaics those run as systemd units so it is easier to onboard them into the logging pipeline, the priority question over application logging remains open though.

Thanks for looking into this! As far as my understanding goes (and please correct me if I'm wrong) the job to do is:

  1. tail json logs matching /var/log/containers/*.log, one json object per line
  2. augment each json entry with metadata obtained by talking https to the k8s api (pod/namespace/container/etc names)
  3. [optional and/or to be done at a later stage?] further parse log attribute into more structured attributes (e.g. apache logs mentioned above)
  4. send the resulting json objects to logstash/elasticsearch for ingestion

Assuming the above is correct, I'd propose to split the job in two parts: namely tail+augment and shipping. With fluentbit handling the former and rsyslog the latter. In other words treat fluentbit's output as any other application logs from rsyslog's POV, with the idea being to keep the two decoupled.

IMHO having rsyslog aware of k8s would bring some challenges (e.g. fixes for mmkubernetes in a new rsyslog upstream version, do we bring in updated rsyslog for k8s only or upgrade rsyslog fleetwide?)

I am inclined to say fleetwide here. But I don't really expect a high rate of fixes, to be honest. I would be surprised if it was >1 per year.

and similarly fluentbit talking to kafka directly would require fluentbit's configuration to somehow know about the kafka broker hostnames (e.g. puppet configures fluentbit's output correctly?)

That's a more interesting question and heavily depends on the method of deployment (debs+puppet or helm). I see potential for divergence and the corresponding issues creeping up.

I'm thus proposing fluentbit to do the tail + augmentation part and rsyslog to ship fluentbit's output. In practical terms it would mean configuring fluentbit to write its output to a file (afaics fluentbit can't output to syslog?) and having rsyslog pick that file up and ingest the json records into kafka/logstash/elasticsearch as usual. What do you think?

Fluentbit indeed can't output to syslog, so we would indeed have to write to a file and have rsyslog pick that file up to ship to kafka. At which point we can ask why we went through the trouble of installing fluentbit in the first place, if only to have every log line written to disk twice, just for the kubernetes metadata augmentation and optionally some easier parsing. Adding one more moving part that needs to be maintained, and its complexity, just for that seems to complicate things needlessly.

As far as the contract @fselles proposes goes, many +1s to it

Thanks for looking into this! As far as my understanding goes (and please correct me if I'm wrong) the job to do is:

  1. tail json logs matching /var/log/containers/*.log, one json object per line
  2. augment each json entry with metadata obtained by talking https to the k8s api (pod/namespace/container/etc names)
  3. [optional and/or to be done at a later stage?] further parse log attribute into more structured attributes (e.g. apache logs mentioned above)
  4. send the resulting json objects to logstash/elasticsearch for ingestion

Assuming the above is correct, I'd propose to split the job in two parts: namely tail+augment and shipping. With fluentbit handling the former and rsyslog the latter. In other words treat fluentbit's output as any other application logs from rsyslog's POV, with the idea being to keep the two decoupled.

IMHO having rsyslog aware of k8s would bring some challenges (e.g. fixes for mmkubernetes in a new rsyslog upstream version, do we bring in updated rsyslog for k8s only or upgrade rsyslog fleetwide?)

I am inclined to say fleetwide here. But I don't really expect a high rate of fixes, to be honest. I would be surprised if it was >1 per year.

My hesitation around mmkubernetes bugfixes stems from the fact that it was introduced upstream with a commit in 2016 and all remaining commits (~15) are from 2018. fluentd is similarly young, though with the advantage in this case that deploying its upgrades would be limited to k8s clusters rather than fleetwide.

Anyway, for mmkubernetes we should follow up with upstream re: the negative API result caching issue you've identified.

and similarly fluentbit talking to kafka directly would require fluentbit's configuration to somehow know about the kafka broker hostnames (e.g. puppet configures fluentbit's output correctly?)

That's a more interesting question and heavily depends on the method of deployment (debs+puppet or helm). I see potential for divergence and the corresponding issues creeping up.

I'm thus proposing fluentbit to do the tail + augmentation part and rsyslog to ship fluentbit's output. In practical terms it would mean configuring fluentbit to write its output to a file (afaics fluentbit can't output to syslog?) and having rsyslog pick that file up and ingest the json records into kafka/logstash/elasticsearch as usual. What do you think?

Fluentbit indeed can't output to syslog, so we would indeed have to write to a file and have rsyslog pick that file up to ship to kafka. At which point we can ask why we went through the trouble of installing fluentbit in the first place, if only to have every log line written to disk twice, just for the kubernetes metadata augmentation and optionally some easier parsing. Adding one more moving part that needs to be maintained, and its complexity, just for that seems to complicate things needlessly.

Fair enough, ideally we wouldn't need the metadata translation done by an external component at all. I still maintain that having k8s+fluentbit decoupled from rsyslog is a good thing in terms of architecture, even when considering the complexity of a k8s daemonset to run. At any rate, let's unblock this, I'm +1 on mmkubernetes and see where that gets us.

As far as the contract @fselles proposes goes, many +1s to it

Sounds like a good contract, +1 indeed!

@fgiunchedi I do agree with you, since fluentbit / fluentd is also the most widely adopted approach, so we can benefit from all the new features introduced. However, if we keep the contract, revisiting this later to include fluentd will be easy, so there is no harm in trying this first and switching later.

:)

Change 485801 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] blubberoid/zotero: Remove the logging sidecar container

https://gerrit.wikimedia.org/r/485801

Change 485801 merged by Alexandros Kosiaris:
[operations/deployment-charts@master] blubberoid/zotero: Remove the logging sidecar container

https://gerrit.wikimedia.org/r/485801

Change 538626 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] rsyslog: Support adding metadata to input, default to off

https://gerrit.wikimedia.org/r/538626

Change 538627 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] rsyslog: populate kubernetes configuration

https://gerrit.wikimedia.org/r/538627

Change 539419 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] rsyslog: Support adding cee tag to input file

https://gerrit.wikimedia.org/r/539419

Change 538626 merged by Alexandros Kosiaris:
[operations/puppet@production] rsyslog: Support adding metadata to input, default to off

https://gerrit.wikimedia.org/r/538626

Change 539419 merged by Alexandros Kosiaris:
[operations/puppet@production] rsyslog: Support adding cee tag to input file

https://gerrit.wikimedia.org/r/539419

Change 539519 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] rsyslog: Correctly parse docker logs

https://gerrit.wikimedia.org/r/539519

Change 538627 merged by Alexandros Kosiaris:
[operations/puppet@production] rsyslog: populate kubernetes configuration

https://gerrit.wikimedia.org/r/538627

Logs are now making it to logstash, so I am gonna boldly resolve this. That being said, there is a minor straggler that needs to be resolved, namely the JSON-in-JSON parsing of logs: most services ship logs in JSON format, which then gets wrapped in docker's JSON. Discussion is ongoing in https://gerrit.wikimedia.org/r/539519, although the approach in that patch will probably not be chosen and the JSON-in-JSON parsing will be done in logstash instead.
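
A minimal sketch of how the nested JSON could be handled on the logstash side, assuming the docker-wrapped payload ends up in a field named message (the field names and the conditional are assumptions, not the actual patch):

filter {
  # the application's own JSON arrives as a string inside the "message" field;
  # only attempt to parse it when the payload looks like JSON
  if [message] =~ /^\{/ {
    json {
      source => "message"
      target => "log"
    }
  }
}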

Change 539978 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: parse nested json from mmkubernetes

https://gerrit.wikimedia.org/r/539978

Change 540535 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] zotero: kill logs

https://gerrit.wikimedia.org/r/540535

Change 540535 merged by Alexandros Kosiaris:
[operations/deployment-charts@master] zotero: kill logs

https://gerrit.wikimedia.org/r/540535

Change 539912 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Revert "Revert "rsyslog: populate kubernetes configuration""

https://gerrit.wikimedia.org/r/539912

Change 539912 merged by Alexandros Kosiaris:
[operations/puppet@production] Revert "Revert "rsyslog: populate kubernetes configuration""

https://gerrit.wikimedia.org/r/539912

Change 539978 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: parse nested json from mmkubernetes

https://gerrit.wikimedia.org/r/539978

Change 539519 abandoned by Filippo Giunchedi:
rsyslog: Correctly parse docker logs

Reason:
Superseded by I6320ce17c50

https://gerrit.wikimedia.org/r/539519

Change 541426 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] admin: Add view clusterrolebinding

https://gerrit.wikimedia.org/r/541426

Change 541426 merged by jenkins-bot:
[operations/deployment-charts@master] admin: Add view clusterrolebinding

https://gerrit.wikimedia.org/r/541426

Change 800282 had a related patch set uploaded (by Ori; author: Ori):

[operations/puppet@production] Beta Cluster: ship logs from docker services to logstash

https://gerrit.wikimedia.org/r/800282