Let's start with the requirements:
- We need to be able to tail/grep/search these logs with ease, for debugging purposes. Logstash is acceptable only if it is fast and reliable (any lag would kill our ability to debug things while they happen).
- We need to be able to run mtail on those logs
- We produce, daily, around 330 GB of API access logs and 190 GB of website access logs, over all of our traffic
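To put those volumes in perspective, here is a rough back-of-the-envelope sketch of the average sustained write rate they imply. It assumes decimal units (1 GB = 1000 MB) and traffic spread evenly over the day, which it certainly is not, so peak rates will be higher than this average.

```python
# Daily volumes taken from the requirements above, in GB.
# Assumption: decimal units, i.e. 1 GB = 1000 MB.
API_GB_PER_DAY = 330
WEB_GB_PER_DAY = 190
SECONDS_PER_DAY = 24 * 60 * 60


def average_mb_per_second(gb_per_day: float) -> float:
    """Average sustained rate in MB/s for a given daily volume in GB."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


total_gb = API_GB_PER_DAY + WEB_GB_PER_DAY
print(f"total: {total_gb} GB/day")                             # total: 520 GB/day
print(f"average: {average_mb_per_second(total_gb):.1f} MB/s")  # average: 6.0 MB/s
```

This is the aggregate across all hosts; the per-node rate depends on how many nodes share the traffic.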
We create a directory on each k8s node and mount it as a hostPath volume in all Apache containers; Apache then writes its logs there, with a filename derived from the pod name.
Mtail can then run as a DaemonSet parsing those logs.
- Creates additional disk I/O on the k8s hosts
- Log rotation would be somewhat challenging
We make Apache send its logs to logstash and to one or more central log servers by logging to a piped command.
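As a minimal sketch of what such a piped command could look like: Apache writes each log line to the program's standard input, and the program forwards it over the network. Everything below is illustrative rather than a proposed implementation. The `forward` and `udp_sender` helpers are hypothetical names, the syslog priority (facility local0, severity info) is an assumption, and plain UDP is used only for brevity even though it offers no delivery guarantee.

```python
import socket
import sys
from typing import Callable, Iterable

# Assumed syslog priority: facility local0 (16) * 8 + severity info (6) = 134.
PRI = 16 * 8 + 6


def forward(lines: Iterable[str], send: Callable[[bytes], None]) -> int:
    """Send each non-empty log line through send(); return the number sent."""
    sent = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        send(f"<{PRI}>{line}".encode("utf-8", errors="replace"))
        sent += 1
    return sent


def udp_sender(host: str, port: int) -> Callable[[bytes], None]:
    """Return a callable that sends one UDP datagram per payload."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(payload: bytes) -> None:
        sock.sendto(payload, (host, port))

    return send


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical): forwarder.py <host> <port>, reading lines on stdin.
    forward(sys.stdin, udp_sender(sys.argv[1], int(sys.argv[2])))
```

Keeping the transport behind a `send` callable means the same loop could feed a TCP or buffered sender instead, which matters if losing log lines is not acceptable.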
In this case, logs would not be persisted on the individual server, but sent out to a central syslog server where they'd be stored for N days (as I said, we need ~500 GB/day for uncompressed logs, which becomes ~100 GB/day once compressed). Data could also be sent to logstash (at least sampled) for further and easier analysis. Mtail would have to run on the central log server.
- might lose logs if the central server is not HA
- needs piped logging, which is not always great for performance, and would probably require us to implement a better logger than Apache's own.
- log retention
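To give the retention question some scale, the sketch below estimates the disk space needed on the central server for a few retention periods. The compressed daily volume is the figure quoted above; the retention periods are arbitrary examples, since N has not been decided, and the estimate ignores any days kept uncompressed for tailing and grepping.

```python
# Compressed daily volume quoted above, in GB (decimal: 1 TB = 1000 GB).
COMPRESSED_GB_PER_DAY = 100


def retention_storage_tb(days: int, gb_per_day: float = COMPRESSED_GB_PER_DAY) -> float:
    """Approximate storage in TB needed to keep `days` days of compressed logs."""
    return days * gb_per_day / 1000


# Example retention periods only; the actual N is still to be chosen.
for days in (7, 30, 90):
    print(f"{days:>3} days -> {retention_storage_tb(days):.1f} TB")
# Prints:
#   7 days -> 0.7 TB
#  30 days -> 3.0 TB
#  90 days -> 9.0 TB
```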
I see advantages to both approaches, but I'd like to hear from the observability folks which solution they think is best.