
Logging options for apache httpd in k8s
Closed, Resolved · Public

Description

Let's start with the requirements:

  • We need to be able to tail/grep/search these logs with ease, for debugging purposes. Logstash is ok only if it can be fast and reliable (any lag would kill our ability to debug things while they happen)
  • We need to be able to run mtail on those logs
  • We produce, daily, around 330 GB of API access logs and 190 GB of website access logs, over all of our traffic

Solution 1

We create a directory on the k8s node that is mounted as a hostPath volume in all apache containers, and we make apache write its logs there, with a filename depending on the pod name.
Mtail can then run as a DaemonSet parsing those logs.
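
As a minimal sketch of the mount side of this solution (all names and paths here are illustrative, not from the task):

```yaml
# Pod-side hostPath mount: apache writes into a node-local directory
# shared by all apache pods on that node.
apiVersion: v1
kind: Pod
metadata:
  name: mediawiki-example
spec:
  containers:
    - name: apache
      image: httpd:2.4
      volumeMounts:
        - name: apache-logs
          mountPath: /var/log/apache2   # apache's log dir inside the container
  volumes:
    - name: apache-logs
      hostPath:
        path: /srv/apache-logs          # shared log directory on the k8s node
        type: DirectoryOrCreate
```

The mtail DaemonSet would mount the same hostPath read-only and glob the per-pod filenames.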

Open issues:

  • Creates i/o on the k8s hosts
  • Log rotation would be somewhat challenging

Solution 2

We make apache send its logs to logstash and central log server(s) by logging to a piped command.

In this case, logs would not be persisted on the individual server, but sent out to a central syslog server where they'd be stored for N days (as noted above, we produce ~500 GB of uncompressed logs per day, which compress to ~100 GB/day). Data could also be sent to logstash (at least sampled) for further, easier analysis. Mtail would have to run on the central log server.
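
Piped logging in apache looks like the following sketch (the forwarder command and target host are placeholders, not something decided in this task):

```apache
# Instead of a filename, CustomLog can take a "|command": apache spawns the
# command at startup and writes each log line to its stdin.
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog "|/usr/local/bin/log-forwarder --target central-syslog.example:514" common
```

The piped process is what makes this solution performance-sensitive: apache blocks if the pipe's buffer fills, which is why the comment below about needing a good logger applies.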

Open issues:

  • might lose logs if the central server is not HA
  • needs piped logging, which is not always great for performance, and would probably require us to implement a better logger than apache's own.
  • log retention

Solution 3

A node-level daemon that just waits for log messages on a unix socket, with that socket being bind-mounted into the pods and apache writing to it. That component can sample if wanted and route where needed. Interestingly, we already have a component that can do routing and throttling with message dropping: fluent-bit.

The advantage would be that we'd get the best of both worlds, at the cost of a slightly more complex setup.
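
A rough sketch of what such a fluent-bit node daemon could look like (socket path, rates, and the upstream host are all illustrative assumptions):

```ini
# fluent-bit listening on a node-local unix socket that is bind-mounted
# into the apache pods; it can throttle/sample and route onwards.
[INPUT]
    Name    syslog
    Mode    unix_udp
    Path    /run/apache-logs/fluent-bit.sock

[FILTER]
    Name    throttle
    Match   *
    Rate    1000       # max messages per Window; excess is dropped
    Window  5

[OUTPUT]
    Name    forward
    Match   *
    Host    central-log.example
    Port    24224
```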

Event Timeline

Joe triaged this task as High priority.Oct 19 2020, 8:50 AM
Joe created this task.
Joe added a project: observability.

An additional data point that was requested: we would be sending ~10-15k messages per second to the central log server, depending on traffic.

A couple of points:

We create a directory on the k8s node that is mounted as a hostPath volume in all apache containers, and we make apache write its logs there, with a filename depending on the pod name.

Another con is that unless we can rely on an env var, we might have to change apache's on-disk configuration per pod, which means every config will become a snowflake as far as hash-based comparison (md5 and the like) goes. It will also require that we effectively use mutable images (and we might want to avoid that).

Mtail can then run as a DaemonSet parsing those logs.

Or just via puppet, at least in the beginning, which should make the migration easier.

Log rotation would be somewhat challenging

Yes, quite a bit, given that the node will have to inform all the apaches in all the pods that their logs need to be rotated.

  • Let me add that we'll probably need to repartition the k8s nodes for Solution 1, as the bulk of the disk is given to containers and not to the host fs or any dedicated log fs.

Creates i/o on the k8s hosts

Indeed, but it might or might not be an issue. We'll need some numbers on that. Intuitively I don't think it would become a problem, but it makes sense to keep an eye on it.

The interesting question is how much of that will be a problem in Solution 2, as centralization is only going to exacerbate it.

There is also a solution 3 that is a hybrid of solutions 1 and 2. A node-level daemon that just waits for log messages on a unix socket, with that socket being bind-mounted into the pods and apache writing to it. That component can sample if wanted and route where needed. Interestingly, we already have a component that can do routing and throttling with message dropping: fluent-bit.

Just dropping a quick update here, we should schedule some time to review options. Had a brief exchange with @akosiaris and we'll get the team together for a discussion on proposed paths and collaboration.

@lmata we really need to set up a meeting to tackle the questions here and in T271822 pretty soon; we're at the point where not figuring out this stuff will harm our schedule on the mediawiki on kubernetes project. If observability has already discussed the options here, we're glad to review them beforehand.

noted @Joe! I'll reach out to you to coordinate a time to talk with the team.

At the meeting we decided it's ok to let apache log to kafka as the main method of collection. We will therefore, at least in a first iteration:

  • Log to /dev/stdout from apache, in json format
  • The container runtime will save such logs on disk
  • rsyslog will pick them up and send them to kafka
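
The apache side of the first step could look like the following sketch (the field list here is illustrative; the actual production format is ECS):

```apache
# JSON-formatted access log written to the container's stdout, where the
# container runtime persists it for rsyslog to pick up and ship to kafka.
LogFormat "{\"timestamp\":\"%{%Y-%m-%dT%H:%M:%S}t\",\"client\":\"%a\",\"method\":\"%m\",\"url\":\"%U\",\"status\":%>s,\"bytes\":%B}" json
CustomLog /dev/stdout json
```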

Once we start having some traffic, we might want to switch to the following:

  • Have the CustomLog directive pipe to a process that will produce the messages to kafka
  • Potentially pick separate topics for the various clusters, and the canary deployments too
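
The first of those two points could be as simple as the sketch below (kcat is shown as an example producer, and the broker name is a placeholder; the topic name is the one proposed later in this task):

```apache
# Pipe access logs straight to a kafka producer instead of going through
# stdout + rsyslog; kcat -P produces one message per input line.
CustomLog "|/usr/bin/kcat -P -b kafka-logging1001:9092 -t mediawiki.httpd.accesslog" json
```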

We might send only a sample of the messages to the ELK stack. We will need to find ways to process these logs via mtail to keep producing the metrics we want. That might happen on the central log server for the time being, and will probably require finding a way to feed the logs from kafka to mtail. I'll open a subtask for that.

JMeybohm lowered the priority of this task from High to Medium.Mar 3 2021, 8:07 AM

Lowering priority to medium as per discussion with @Joe

Joe removed Joe as the assignee of this task.Jun 28 2021, 9:43 AM
Joe moved this task from Blocked to Backlog on the MW-on-K8s board.

Change 864547 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/docker-images/production-images@master] httpd-fcgi: allow logging ECS to a local rsyslog

https://gerrit.wikimedia.org/r/864547

Change 864548 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] mediawiki: allow rsyslog to process the apache logs

https://gerrit.wikimedia.org/r/864548

The two attached patches implement Solution 3.

Now we just need to create the appropriate topic, named mediawiki.httpd.accesslog, on both kafka-logging clusters. I'd keep the number of partitions relatively high given the traffic we expect once at steady state.
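
For reference, a sketch of the topic creation (broker address, partition and replication counts are placeholders to be tuned for the expected 10-15k msg/s):

```shell
# Create the access-log topic on a kafka-logging cluster.
kafka-topics.sh --create \
  --bootstrap-server kafka-logging1001:9092 \
  --topic mediawiki.httpd.accesslog \
  --partitions 12 \
  --replication-factor 3
```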

Things left to do:

  • Create the kafka topic
  • Test everything works in production
  • Use benthos to replace mtail
  • Set up a sampled logstash ingestion and dashboard.

Change 864547 merged by Giuseppe Lavagetto:

[operations/docker-images/production-images@master] httpd-fcgi: allow logging ECS to a local rsyslog

https://gerrit.wikimedia.org/r/864547

Kafka and logstash ingestion points configured.

Change 864548 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: allow rsyslog to process the apache logs

https://gerrit.wikimedia.org/r/864548

We now have the logs in kafka; they should thus also be ingested in logstash, and we should create a dashboard.

Once that's done, we should also reduce the retention time of the kafka topic to 1 day at most.

If we did T291645: Integrate Event Platform and ECS logs and T276972: Set up cross DC topic mirroring for Kafka logging clusters, these logs could be mirrored to Kafka jumbo and available in Hive and Turnilo too.

While that is nice in general, I don't think there's great use for these logs in Hive for analytics purposes right now. It's great to know we'll have the option in the future.

We now have the logs in kafka; they should thus also be ingested in logstash, and we should create a dashboard.

Once that's done, we should also reduce the retention time of the kafka topic to 1 day at most.

As per https://phabricator.wikimedia.org/T324439#8513139 retention is now set to 2 days. The logs are ingested by logstash with a drop rate of 99%.
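
For reference, the 2-day retention would be set with something like the following sketch (broker address is a placeholder):

```shell
# Set topic-level retention to 2 days (172800000 ms).
kafka-configs.sh --alter \
  --bootstrap-server kafka-logging1001:9092 \
  --entity-type topics --entity-name mediawiki.httpd.accesslog \
  --add-config retention.ms=172800000
```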