Docker container logs (stdout, stderr) can grow quite large
Closed, Resolved · Public

Description

We have a free space alert on kubernetes2002 currently because the shellbox-constraints log has grown to 14G already. I've killed the pod as a quick countermeasure, but I think we should configure log rotation for this (or send to syslog directly? Not an option, as rsyslog's mmkubernetes module only supports the json-file or journald log drivers, https://www.rsyslog.com/doc/master/configuration/modules/mmkubernetes.html).

https://docs.docker.com/config/containers/logging/configure/

Event Timeline

JMeybohm created this task.

Just to clarify, are these the logs that are stored at /var/log/pods/<deployment>/<container>/*.log? I think setting up a quick log rotate for now would be fine.

I do expect Shellbox will end up following whatever T288851: Make logging work for mediawiki in k8s comes up with.

Just to clarify, are these the logs that are stored at /var/log/pods/<deployment>/<container>/*.log?

Exactly, yes. So it's of course not an issue of Shellbox per se, but something we should deal with in general.

I ran into a similar issue the other day on kubernetes1014. It was a wikifeeds pod that had been running for a good 3 months, having restarted once during that timeframe. So we ended up with one running container holding some 7GB of logs and the previous one (which is kept around by the kubelet) consuming another 6GB or so. The fix was similar: just delete the pod and let it be recreated.

So, I agree, this is to be solved in a more general way.

Doing a logrotate would probably be more involved than it's worth, e.g. how do you tell dockerd to reopen stdout/stderr for a running container (it's dockerd that actually opens the file, not the application itself)? I guess we can restart dockerd (the running containers won't be affected), but I have no idea if it will open up new log files.

Plus, Kubernetes hardcodes the logging driver to json-file and manages the number of Docker containers that are around for each pod (it keeps the previous one around). So things like "log-driver": "json-file", "log-opts": { "max-size": "10m", "max-file": "3" } in /etc/docker/daemon.json won't work as expected, as another component (the kubelet) takes decisions as well. And finally, we want to get rid of Docker and move to containerd, as the Kubernetes docker shim is already deprecated and will be removed in a future release.
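For reference, the daemon.json shape being discussed would look roughly like this (a sketch only, using the example values quoted above; as noted, on our nodes the kubelet still manages container retention, and Docker only applies these defaults to containers created after the change):

```
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```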

With all of the above in mind, it might be easier to just set up a (say, hourly) systemd timer on the deployment server that fetches all running pods from all namespaces, excluding kube-system, and deletes the X oldest pods. With 400 pods, X=1 and a 1h timer, all pods would have been refreshed every ~17 days. That would just be a poor man's, somewhat more deterministic version of https://github.com/asobti/kube-monkey.
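A minimal sketch of what such a timer's payload could look like (hypothetical, not something we run anywhere; assumes kubectl on the deployment server is configured for the target cluster and X=1):

```
#!/bin/bash
# Hypothetical sketch: delete the X oldest running pods outside kube-system so
# that every pod eventually gets recreated (and its log files start fresh).
set -eu
X=1
kubectl get pods --all-namespaces \
  --field-selector=status.phase=Running \
  --sort-by=.metadata.creationTimestamp \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\n"}{end}' \
  | awk -F'\t' '$1 != "kube-system"' \
  | head -n "$X" \
  | while read -r namespace pod; do
      kubectl delete pod -n "$namespace" "$pod"
    done
```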

Note, btw, that this can't really cause an outage. When nodes come under disk pressure, pods will be evicted and migrated to other nodes. After that is done, the node will be able to receive workloads again. A nice story showcasing this is in T264195.

Did you try the logrotate approach?

An important consideration in node-level logging is implementing log rotation, so that logs don't consume all available storage on the node. Kubernetes is not responsible for rotating logs, but rather a deployment tool should set up a solution to address that. For example, in Kubernetes clusters, deployed by the kube-up.sh script, there is a logrotate tool configured to run each hour. You can also set up a container runtime to rotate an application's logs automatically.

Note, btw, that this can't really cause an outage. When nodes come under disk pressure, pods will be evicted and migrated to other nodes. After that is done, the node will be able to receive workloads again. A nice story showcasing this is in T264195.

Good point. We could also decide to align the DiskPressure threshold with our alert threshold and forget about it then. But IMHO we should try to prevent alerts in such situations.

Discovered again by our dear colleagues @hnowlan and @elukey in T290444
Official CRI docs suggest setting max-size in docker config as well, so I think we should at least give it a try:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker

Change 719550 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] k8s: Instruct docker to keep logs at 100M

https://gerrit.wikimedia.org/r/719550

Change 719551 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] k8s: Instruct docker to keep logs at 100M, followup

https://gerrit.wikimedia.org/r/719551

Did you try the logrotate approach?

Yes, I had done some testing in the past (around 1.4, I think?). It was pretty rigid and caused log corruption. It got removed as a solution in 1.6, as I see now. See https://github.com/kubernetes/kubernetes/issues/38495 and https://github.com/kubernetes/kubernetes/pull/40634; the latter articulates quite well why.

Then it was re-added in 1.14 (https://github.com/kubernetes/kubernetes/commit/7ffa7da1122fd6a7925ddc730a6cb508e7f2f389), then put behind an if-guard and defaulted to disabled (https://github.com/kubernetes/kubernetes/commit/adf7ed4241aea6571bb532c91b434e7f4494b421). And unless I am reading it wrong, it would be broken in our env (notice how it logrotates /var/log/pods/*/*.log whereas we have our logs in /var/log/pods/*/*/*.log), which doesn't fill me with confidence that it is widely used.

An important consideration in node-level logging is implementing log rotation, so that logs don't consume all available storage on the node. Kubernetes is not responsible for rotating logs, but rather a deployment tool should set up a solution to address that. For example, in Kubernetes clusters, deployed by the kube-up.sh script, there is a logrotate tool configured to run each hour.

You can also set up a container runtime to rotate an application's logs automatically.

Now, that is something worthy of revisiting.
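For the record, the runtime-side rotation that 1.14 re-added (commits above) is driven by the kubelet; to my understanding it only takes effect with a remote CRI runtime such as containerd, not with the docker shim / json-file setup we run today, so this would be part of the containerd move rather than a fix for now. A hypothetical sketch of the relevant kubelet knobs:

```
# Hypothetical example of kubelet-managed log rotation settings
# (assumption: only effective with a CRI runtime such as containerd).
kubelet \
  --container-log-max-size=100Mi \
  --container-log-max-files=3
```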

Discovered again by our dear colleagues @hnowlan and @elukey in T290444
Official CRI docs suggest setting max-size in docker config as well, so I think we should at least give it a try:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker

Yes, https://gerrit.wikimedia.org/r/c/operations/puppet/+/719550 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/719551

Change 719550 merged by Alexandros Kosiaris:

[operations/puppet@production] k8s: Instruct docker to keep logs at 100M

https://gerrit.wikimedia.org/r/719550

Change 719893 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] k8s: Fix docker log-options key

https://gerrit.wikimedia.org/r/719893

Change 719893 merged by Alexandros Kosiaris:

[operations/puppet@production] k8s: Fix docker log-options key

https://gerrit.wikimedia.org/r/719893

Change 719551 merged by Alexandros Kosiaris:

[operations/puppet@production] k8s: Instruct docker to keep logs at 100M, followup

https://gerrit.wikimedia.org/r/719551

Change merged and deployed. All pods restarted to pick up the change as well. Let's monitor this over the next couple of weeks and hopefully resolve.

Change merged and deployed. All pods restarted to pick up the change as well. Let's monitor this over the next couple of weeks and hopefully resolve.

Everything OK?

T295627: Millions of access log entries per hour from shellbox httpd is related too, but on the logstash side.