Extremely high latency over NFS between kubernetes node and bastion host
Open, Lowest Priority, Public

Description

Open two windows logged onto the bastion host (I'm using tools-sgebastion-08). Become your tool. In one window, tail -f an existing file.

In the other window, connect to a kubernetes node with "webservice --backend=kubernetes python3.7 shell", and append something to that file, using "echo foo >> filename".

Watch how long it takes for that to show up in the tail output. I've captured an example in this video where it takes about 15 seconds. The bottom window is the tail on the bastion host. The top window is the k8s node. I've seen it take as long as 90 seconds. I've seen it show up in as little as 1 or 2 seconds.

I first noticed this when tailing my uwsgi.log file, and was able to reproduce it with logging from inside my django app.

Event Timeline

Top left: tail on an unused bastion.
Bottom left: tail on the host that is running uwsgi.
Bottom right: strace on uwsgi.
Top right: Firefox refreshing the page.

Seconds of delay. Since tail -f relies on inotify, I'll also show watch tail.

So do you see this on the worker nodes? If not, I'd blame the cgroup limits on bastion users. You can test that by trying your tail as root instead of as a user. I'm thinking that the cgroup limits may be affecting your buffering/caching behaviors.

@Bstorm , I'm not sure if your question was addressed to me, but I don't have root on the bastions :-)

Root ssh on a spare bastion. Running tail on the uwsgi host seems to decrease the latency somewhat (watch the timestamps), but if I don't run tail on the uwsgi host there is definitely significant latency (many seconds).

Mentioned in SAL (#wikimedia-cloud) [2020-06-25T21:39:24Z] <zhuyifei1999_> disabling puppet on tools-sgebastion-09 so I can play with mount settings T256426

Mentioned in SAL (#wikimedia-cloud) [2020-06-25T21:49:58Z] <zhuyifei1999_> re-enabling puppet on tools-sgebastion-09 T256426

So, from IRC chat, it's probably an NFS cache somewhere, but since we are on a slow connection to NFS we can't disable much, if any, of the NFS caching.

Thanks to everybody for diving in and working on this. In the meantime, is there some workaround I could use? Is there some other host I could log into which has better NFS connectivity?

What is your real goal? Seeing logging output from your pod? If so, sending logs to stderr or stdout will make them show up in kubectl logs <pod_name> without ever hitting NFS. In my stashbot project I have two loggers set up: "normal" logging is written to stderr, which I can read from the transient kubectl logs ... buffer, and error+ logs are *also* written out to a file on NFS for crash reporting.
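
A minimal sketch of that two-logger idea (illustrative only, not stashbot's actual code; the logger name and NFS path are placeholders):

import logging
import sys

log = logging.getLogger("mytool")
log.setLevel(logging.DEBUG)

# "Normal" logging goes to stderr, which Kubernetes captures and exposes via kubectl logs.
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.INFO)
log.addHandler(stderr_handler)

# error+ records are *also* written to a file on NFS for crash reporting.
nfs_handler = logging.FileHandler("/data/project/mytool/logs/error.log")
nfs_handler.setLevel(logging.ERROR)
log.addHandler(nfs_handler)

log.info("visible in kubectl logs only")
log.error("visible in kubectl logs and in error.log on NFS")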

Yeah, my immediate goal is just to get logging feedback during development. I've never used kubectl logs, but it sounds straightforward. I'll give it a try. Thanks.

I see the part in the task description now about uwsgi.log. This is something I have been wondering how to tweak myself for a while now. We need to do something fancier than the simple concatenation of the uwsgi config generated by webservice and any user-supplied $HOME/www/python/uwsgi.ini to fix this. I haven't figured out a "good" way to do a more advanced merge yet though.
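
For the record, the kind of merge I have in mind would look roughly like this configparser sketch (the paths and the "user values win" precedence are assumptions on my part, not how webservice behaves today):

import configparser

# Base config as generated by webservice (hypothetical path).
merged = configparser.ConfigParser(strict=False, interpolation=None)
merged.read("/tmp/uwsgi-generated.ini")

# User-supplied overrides from the tool's home directory (hypothetical tool name).
user = configparser.ConfigParser(strict=False, interpolation=None)
user.read("/data/project/mytool/www/python/uwsgi.ini")

# Overlay section by section; user values replace generated ones.
for section in user.sections():
    if not merged.has_section(section):
        merged.add_section(section)
    for key, value in user.items(section):
        merged.set(section, key, value)

with open("/tmp/uwsgi-merged.ini", "w") as fh:
    merged.write(fh)

Even that is not quite right, since uwsgi ini files can repeat a key (multiple env = lines, for example) and this collapses them, which is part of why I haven't found a "good" merge yet.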

No joy.

$ kubectl logs spi-tools-dev-67998c5b8f-885sd
open("/usr/lib/uwsgi/plugins/python_plugin.so"): No such file or directory [core/utils.c line 3724]
! UNABLE to load uWSGI plugin: /usr/lib/uwsgi/plugins/python_plugin.so: cannot open shared object file: No such file or directory !
[uWSGI] getting INI configuration from /data/project/spi-tools-dev/www/python/uwsgi.ini

In general, you should log to stdout/stderr on Kubernetes if possible. That way you will see the logs in kubectl logs and they never touch NFS.

They will rotate, however, so it's not exactly a log archive. Most recent logs will be there.

I believe @zhuyifei1999 found that watch tail polled the NFS differently and that was not as lagged as well.

Yup, I can confirm that "watch tail" is significantly faster than "tail -f".

Any update on this? Any kind of workaround to get real-time logging?

Not really. Your use case demonstrates a particular behavior of NFS itself, and we don't have any immediate plans to dramatically increase network speed for that particular NFS cluster (which could reduce, but not eliminate, the buffering/locking/access behaviors of NFS). When the hardware of the cluster gets rebuilt (hopefully in the next year or so?), network latency and such is something we are going to try to improve. I don't imagine that will totally change this behavior, though.

watch tail does intermittent polling, so it doesn't trigger this behavior as much. tail -f exercises the whole file access, buffering, and locking system within NFS, and there's no way around that as long as we are using NFS.

I would log whatever you are trying to follow to stdout, which should make it visible when you run kubectl logs <podname>. You can even do kubectl logs --tail 1 --follow <podname>, which will follow the log as close to real-time as the Kubernetes system can get. That logs to the local disk on the exec node and is then viewed through the Kubernetes API, entirely dodging NFS. @bd808 is there a way to make uwsgi itself log to stdout? I honestly have never tried that.

As I pointed out above (and opened T256482 for), kubectl logs doesn't run.

That's it running correctly. That's real output that our python containers unfortunately always give when they start because they try to load both python2 and python3 (IIRC), so one always fails.

I'm not getting how this is supposed to work. I've reconfigured my django app to log to stderr. That output is ending up in $HOME/uwsgi.log. When I run "kubectl logs", all I get is the error message mentioned above, and no additional output, no matter how long I wait.

$ kubectl logs spi-tools-dev-67998c5b8f-hw5f8 --follow
open("/usr/lib/uwsgi/plugins/python_plugin.so"): No such file or directory [core/utils.c line 3724]
! UNABLE to load uWSGI plugin: /usr/lib/uwsgi/plugins/python_plugin.so: cannot open shared object file: No such file or directory !
[uWSGI] getting INI configuration from /data/project/spi-tools-dev/www/python/uwsgi.ini

Eventually, I got "error: unexpected EOF", and kubectl exited. I'm assuming this was correlated with when I restarted the server.
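
For reference, the change I made in settings.py is roughly this kind of thing (a sketch of the idea, with illustrative handler names, not my exact config):

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Send records to stderr, which Kubernetes is supposed to pick up for kubectl logs.
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stderr",
        },
    },
    "root": {
        "handlers": ["console"],
        "level": "INFO",
    },
}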

As I mentioned above, this comes back to how the uwsgi config is built: we need something fancier than the simple concatenation of the uwsgi config generated by webservice and any user-supplied $HOME/www/python/uwsgi.ini, and I haven't figured out a "good" way to do a more advanced merge yet.