Page MenuHomePhabricator

eventutilities-python manager should set up python logging with ECS format
Closed, ResolvedPublic

Description

Our flink helm charts configure ECS for Java logging. We should do this for python logs as well.

https://www.elastic.co/guide/en/ecs-logging/python/current/installation.html

To do this, eventutililties-python manager should support log configuration from a file. We should add a param --log_config=path/to/log.config that is passed to logging.config.fileConfig, and set the defaults to use ECS logging (basically this but in configparser format).

Done is
  • evenutilities_python.manager.load_config supports configuring python logging from config file
  • Either: default logging format is set to ECS, or our helm chart provides a logging config file that sets logging format to ECS.

Details

TitleReferenceAuthorSource BranchDest Branch
Upgrade eventutilities-python to 0.7.0repos/data-engineering/mediawiki-event-enrichment!55ottoeventutilities-0.7.0main
Add Log Configuration Supportrepos/data-engineering/eventutilities-python!48tchinlog-configmain
Customize query in GitLab

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Ottomata updated the task description. (Show Details)

Hm, @tchin! I'm looking more closely at the python production logs, and um, they look like they are in ECS format already!

{"@timestamp":"2023-05-04T20:00:57.977Z","log.level":"ERROR","message":"Failed requesting wiki content from de.wikipedia.org for page_id 12658243, rev_id 233363787: {'badrevids': {'233363787': {'revid': 233363787, 'missing': True}}} ", "ecs.version": "1.2.0","process.thread.name":"grpc-default-executor-587","log.logger":"/srv/app/mediawiki_event_enrichment/page_content_change.py:84"}

EEK. Sorry. So, okay I don't know how this works then. I would like to be able to use the extra param and extra context in actual ECS/logstash fields though, so maybe ecs-python will still help us. But, I'm not sure what happens here, if the logging is already coming in ECS format. Might require some experimentation.

Okay, I deployed with this logging message, and the message looks like:

{"@timestamp":"2023-05-04T20:20:29.983Z","log.level":"ERROR","message":"Failed requesting ar.wikipedia.org content from https://api-ro.discovery.wmnet/w/api.php?action=query&format=json&formatversion=2&prop=revisions&revids=62328001&rvprop=content&rvslots=* ", "ecs.version": "1.2.0","process.thread.name":"grpc-default-executor-1","log.logger":"/srv/app/mediawiki_event_enrichment/page_content_change.py:112"}

None of the extra stuff is there as I'd like it to be. So hopefully the python ecs-logging formatter will help us. I just hope it doesn't double format?

Do we know what's turning them into ecs format in the first place?

Not 100% but pretty sure it is our k8s stdout -> logstash stuff. We should ask ServiceOps I guess? @JMeybohm what is taking stdout from a container process and turning it into ECS/JSON formatted logs I see in kubectl logs and logstash?

We set up an ECS specific logging format for Flink Java logs, and were about to for python logs as well.

The logging pipeline on k8s is described here https://wikitech.wikimedia.org/wiki/Kubernetes/Logging

There is nothing fully "converting" or "normalizing" the logs into ECS, though.

Okay, so, from reading those docs and looking at app logs in logstash, e.g. this from java and this from python, it looks like no matter what format we configure our loggers, the configured docker json-file logging driver will consume each stdout/stderr line as single log message and stick it in its own json format. That means that even though we log in ECS json, our log message is being consumed as a json string and sent as that. E.g.

{"log":"{\"@timestamp\":\"2023-05-04T20:20:05.877Z\",\"log.level\": \"INFO\",\"message\":\"Source: kafka__rc1.mediawiki.page_change -\u003e kafka__rc1.mediawiki.page_change__row_to_dict, filter_remove_canary_events, _stream_key_by_map_operator (1/1) (37c3c06bae61447af0395b811a5695bb_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from CREATED to SCHEDULED.\", \"ecs.version\": \"1.2.0\",\"process.thread.name\":\"flink-akka.actor.default-dispatcher-16\",\"log.logger\":\"org.apache.flink.runtime.executiongraph.ExecutionGraph\"}\n","stream":"stdout","time":"2023-05-04T20:20:05.877993509Z"}

It'd be nice to have our ECS log messages consumed as is, instead of reformatted, but I'm not sure if this is possible.

Maybe @colewhite could advise us here? Is T292881: Mutate mmkubernetes k8s fields into ECS fields related?

@tchin I think we should move forward with this python ECS logging as planned. At the very least it makes reading logs via kubectl logs much easier to read. We'll have to figure out how to ship them to logstash as is outside of this ticket.

It'd be nice to have our ECS log messages consumed as is, instead of reformatted, but I'm not sure if this is possible.

We extract the structured logs from the "log"field when shipped by rsyslog input-file. The logs should show up in logstash in the appropriate format.

Is T292881: Mutate mmkubernetes k8s fields into ECS fields related?

I will close that task. We added this capability as part of T234565. There are still some fields to define in the comments, but the major ones are there.

@colewhite we are logging in ECS format for Java logs, so the log field from docker json-file format should be an ECS json string. Maybe I'm not looking in the right place, but I haven't been able to find a log message in logstash with the expected other fields, e.g. process.thread.name: flink-akka.actor.default-dispatcher-16

But I'm also not great at using logstash, so maybe I just failed finding this log message?

We extract the structured logs from the "log"field when shipped by rsyslog input-file

@tchin IIUC, this means that we should just use ecs-python then, cuz even though the log output will look a little funky in kubectl logs, logstash should do the right thing and use the ECS log message.

Change 917999 had a related patch set uploaded (by TChin; author: TChin):

[operations/deployment-charts@master] Add flink-app default log config and use it in page_content_change

https://gerrit.wikimedia.org/r/917999

Change 917999 merged by jenkins-bot:

[operations/deployment-charts@master] Add flink-app default log config and use it in page_content_change

https://gerrit.wikimedia.org/r/917999