Page MenuHomePhabricator

Decide on logging in k8s for ShellBox
Open, MediumPublic

Description

Spinning out as a task from this series of comments on review thread

To quote @tstarling

In PHP-FPM, stdout and stderr may either be /dev/null (the default) or they can both be pipes that go to the parent process, which forwards lines to FPM's log system. There's no obvious way to write directly to stdout or stderr of a controlling terminal that started a service. The best we can do is have a shell script that starts the server and then tails a FIFO. PHP can write to the FIFO and tail -f will forward those writes to the controlling terminal. I think this can be done with the logFile option.

In CGI mode, stdout is forwarded to HTTP. logToStderr is at least guaranteed to not mess up the output.

The question is: which process is supposed to produce logs on stdout for k8s's consumption? The answer is not the FastCGI worker process, k8s doesn't have access to that, even if it could be implemented. Presumably k8s is running something that is supposed to produce logs on stdout. So the question is, what is that thing and how can we make that happen?

I have looked up solutions to this problem, and found a few relevant links:

serviceops please advise. A more generic question, how do we intend to run shellbox on k8s? What php infrastructure do we intend to use? Having that answer I can go search for more relevant solution ideas.

Event Timeline

The "official" PHP docker images use the same method for FPM (output redirection) to get worker logs into stdout so they can be ingested: https://github.com/docker-library/php/blob/d6838d8e45f6356d867dbc8ff8b3513492cba500/7.3/buster/fpm/Dockerfile#L230

This is also what we're using in our own MediaWiki PHP images that use the above as base.

This also relies on FPM's catch_workers_output and decorate_workers_output options to redirect workers' stdout and stderr into the main error log FPM is configured to use, without messing up the formatting of the messages logged by applayer. However, the latter param is only available from PHP 7.3 onwards.

https://pracucci.com/php-on-kubernetes-application-logging-via-unix-pipe.html provides two options for how php-fit could be set up to get the logs from worker processes.

The first option has a lot of drawbacks. The second option is what I proposed in my Gerrit comment.

Another option which occurs to me is to use a higher-numbered FD.

# /opt/php7.3/sbin/php-fpm -F 4>&1
[23-Sep-2020 10:05:20] NOTICE: fpm is running, pid 47854
[23-Sep-2020 10:05:20] NOTICE: ready to handle connections

# lsof -p 47855
...
php-fpm 47855 www-data    0u   CHR                1,3      0t0       5 /dev/null
php-fpm 47855 www-data    1u   CHR                1,3      0t0       5 /dev/null
php-fpm 47855 www-data    2u   CHR                1,3      0t0       5 /dev/null
php-fpm 47855 www-data    4u   CHR              136,2      0t0       5 /dev/pts/2

PHP does not close higher numbered FDs, it lets them be inherited by the worker process. Stdout is duplicated to FD 4, so when the worker writes to FD 4, it will appear to Kubernetes as if it came from stdout. I used FD 4 in this example to avoid conflict with limit.sh which uses FD 3 for a similar purpose.

@Joe says that application logs can go to logstash while php-fpm error logs will go to k8s. I want to say that the simplest way to send application logs to logstash is to use the logToClient option, which serializes log entries and sends them back to MediaWiki. This way, the configuration stays the same, and we don't need to port MediaWiki's custom logstash formatter. You lose log messages generated after the response is sent, but currently that's just temporary directory teardown failures which probably don't matter. For php-fpm error logs, we can start it with "php-fpm -F 2>&1", then logs go to php-fpm's stderr which is redirected to k8s's stdout.