
Figure out what production logging config needs to look like
Closed, Resolved (Public)

Description

Python has a very configurable logging system. Find out from Observability folks how they would like a Python app in the Kubernetes cluster to output logs such that they eventually end up in our production log aggregation system.

Event Timeline

@Joe I think you mentioned on IRC a while ago that you and/or @akosiaris had been thinking a bit about python wsgi logging for Kubernetes deployments. I'd be more than happy to sync up with anyone working on this to talk through requirements and implementation options.

@bd808: I think that if your logging is compatible with the ECS logging schema (see https://github.com/elastic/ecs-logging) it will be quite OK. The design doc and reasoning from the Observability team can be found at https://docs.google.com/document/d/1HYHCPvuz93nAYXQSEReUN07HQTQUF_nvltag5H_YZq4/edit#
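For reference, the ecs-logging spec linked above boils down to emitting one JSON object per log line with a small set of required, dot-named fields. A minimal sketch (the `service.name` value is illustrative, not part of the spec):

```python
import datetime
import json

# An ECS-shaped log line per the ecs-logging spec: @timestamp,
# log.level, message, and ecs.version are the required fields.
event = {
    "@timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "log.level": "info",
    "message": "request served",
    "ecs.version": "1.6.0",
    "service.name": "toolhub",  # illustrative extra field
}
print(json.dumps(event))
```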

Trying out https://github.com/elastic/ecs-logging-python has been "fun". The library assumes that any and all values placed into the "extras" of a logging event are JSON serializable content. My first page load using that formatter proved otherwise when the wsgi access log event blew up due to a socket.socket object being present in the logging context data. I'm not at all sure that I will be able to bend all of the stock Python + Django logging to fit the contrived schema that is https://github.com/elastic/ecs-logging. The upstream library does not seem to provide any tools for doing this. It seems mostly oriented towards greenfield development where you have no 3rd party/non-ECS aware log message generation.
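One stdlib-only workaround for that class of crash (this is not something the upstream library provides) is to give `json.dumps` a fallback encoder so non-serializable extras like socket objects degrade to their `repr()` instead of raising `TypeError`:

```python
import json
import socket

# Serialize logging "extra" data defensively: anything json.dumps
# cannot encode natively is replaced by its repr() string.
def safe_json(obj):
    return json.dumps(obj, default=repr)

sock = socket.socket()
record_extras = {"remote_addr": "127.0.0.1", "socket": sock}
print(safe_json(record_extras))  # the socket value becomes a repr() string
sock.close()
```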

@bd808: I think the desire is to standardize the field names in order to cut down on thousands of fields in logstash. So following the standard field names from ecs is all that is needed and not necessarily using elastic's library.
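Following that advice, a formatter that emits the ECS core field names can be written against the stdlib alone, with no dependency on elastic's library. A hedged sketch (field selection is an assumption based on the ECS docs, not any particular Wikimedia config):

```python
import json
import logging
import time

# Stdlib-only formatter emitting ECS core field names as one JSON
# object per log line.
class EcsishFormatter(logging.Formatter):
    def format(self, record):
        event = {
            "@timestamp": time.strftime(
                "%Y-%m-%dT%H:%M:%S", time.gmtime(record.created)
            ),
            "log.level": record.levelname.lower(),
            "log.logger": record.name,
            "message": record.getMessage(),
            "ecs.version": "1.6.0",
        }
        return json.dumps(event)

# Wire it to stderr, which is where Kubernetes expects container logs.
handler = logging.StreamHandler()
handler.setFormatter(EcsishFormatter())
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("hello")
```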

Yes, and that's actually the most difficult part in an application like Toolhub, which uses multiple upstream libraries and frameworks. Using the core fields is simple enough, but coercing all structured data to fit the extensions is the part that I'm skeptical of.

Change 674413 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[wikimedia/toolhub@main] [WIP] Add support for Elastic Common Schema log formatting

https://gerrit.wikimedia.org/r/674413

I was wondering what needed to be done for log shipping, but @Legoktm pointed me to T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup which seems to indicate that the stderr logging will be picked up on the exec nodes and shipped to the ELK stack for me by some fancy rsyslog magic.

bd808 moved this task from Research needed to Review on the Toolhub board.

Change 714656 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/deployment-charts@master] toolhub: add LOGGING_CONSOLE_FORMATTER env var

https://gerrit.wikimedia.org/r/714656
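For flavor, an env var like LOGGING_CONSOLE_FORMATTER would typically be consumed in the Django settings module to pick the console formatter at deploy time. A hedged sketch (the formatter names and the `myapp.logging.EcsFormatter` dotted path are illustrative, not the actual Toolhub configuration):

```python
import os

# Select the console formatter from the environment; default to a
# plain line formatter for local development.
LOGGING_CONSOLE_FORMATTER = os.environ.get("LOGGING_CONSOLE_FORMATTER", "line")

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "line": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
        # Hypothetical dotted path to an ECS-style formatter class.
        "ecs": {"()": "myapp.logging.EcsFormatter"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": LOGGING_CONSOLE_FORMATTER,
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}
```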

Change 674413 merged by jenkins-bot:

[wikimedia/toolhub@main] system: Add support for Elastic Common Schema log formatting

https://gerrit.wikimedia.org/r/674413

Change 714656 merged by jenkins-bot:

[operations/deployment-charts@master] toolhub: add LOGGING_CONSOLE_FORMATTER env var

https://gerrit.wikimedia.org/r/714656