Page MenuHomePhabricator

enable restbase syslog/file logging
Closed, ResolvedPublic

Description

yesterday while investigating cassandra/restbase issues in the test cluster I noticed restbase sends all logs to logstash
I'd expect at the very least errors or abnormal conditions to be logged locally (either files or syslog) for auditing/debugging purposes

Event Timeline

fgiunchedi raised the priority of this task from to Medium.
fgiunchedi updated the task description. (Show Details)
fgiunchedi added projects: acl*sre-team, RESTBase.
fgiunchedi subscribed.

We have had this discussion a few times now. The basic issue of local logging being blocking (and thus running the risk of taking out a service when the disk fills up) has not been resolved.

We can always enable local logging temporarily for debugging, but I don't think that we should risk taking out important services like RESTBase in normal production use.

Example logging stanza for (temporary) local logging:

logging:
  name: restbase
  streams:
    # level can be trace, info, warn, error
    - level: info 
      path: /tmp/debug.log

For more options, see https://github.com/trentm/node-bunyan#stream-type-file.

Edit: Now documented at https://wikitech.wikimedia.org/wiki/RESTBase#Debugging as well.

We have had this discussion a few times now. The basic issue of local logging being blocking (and thus running the risk of taking out a service when the disk fills up) has not been resolved.

Is this a matter of principle (i.e. disk logging in general), or some specific limitation in our tool chain? Safe, non-blocking logging seems like ought to be a solvable problem.

It is solvable, for example by logging to syslog using udp with https://github.com/mcavage/node-bunyan-syslog. The vast majority of write-to-file methods however aren't safe against disks filling up. They will either block outright, or lead to OOM when async writes never complete.

Previous incident: https://wikitech.wikimedia.org/wiki/Incident_documentation/20140211-Parsoid

It is solvable, for example by logging to syslog using udp with https://github.com/mcavage/node-bunyan-syslog.

Yup, same thing I had in mind. rsyslog is automatically set up on our boxes to listen for UDP packets, so this should be good enough.

The vast majority of write-to-file methods however aren't safe against disks filling up.

This is the part I'm worried about. Specifically, rsyslog writes its logs in the /var partition, which happens to be the same one used by Cassandra.

Hm, so I think I'm saying here that we might have a larger problem on our hands. Sure, RESTBase logs would contribute to that, but we laready have all these other logs going to the same place which might fill up the /var partition.

summing up yesterdays ops/services sync up:

  • we'll take the multi-instance opportunity to separate /var for logs/data [ops]
  • we'll investigate syslog, both local and remote for restbase [services]

does that sound accurate?

summing up yesterdays ops/services sync up:

  • we'll take the multi-instance opportunity to separate /var for logs/data [ops]

Tracked in T113714: Separate /var on restbase.

  • we'll investigate syslog, both local and remote for restbase [services]

Created T113713: Allow logs to be sent to rsyslog.

I think we have to proceed on this. Right now, without local logging, if something breaks with Logstash (see T158602) we're losing all the logs completely, which is very not good. For example, in the latest 24 hours there were some spikes in 500 errors in RESTBase which I would really like to look at, but simultaneously logstash logging is somehow broken, so we've lost some potentially important info.

It seems like all the previous concerns raised here were resolved, and we can safely add async logging through syslog-udp, maybe with a more conservative log-level, error/fatal-only for example.

Or we might even consider normal file logging with aggressive log rotation, since it's enabled in all the other services and we haven't seen any issues with it.

So, we've discussed this on the team meeting and decided to move forward on this.

The final question is whether to use syslog-over-udp or normal file logging? We didn't have any problems with file logging in other services, so I'm not quite sure why should we go with syslog for RESTBase specifically?

@Pchelolo, logging directly to a file is synchronous, which is bad for performance & can cause outages. See the earlier discussion for an example of such an outage.

UDP avoids this, so I'd propose to use that if at all possible.

@Pchelolo, logging directly to a file is synchronous, which is bad for performance & can cause outages. See the earlier discussion for an example of such an outage.

Em, writing to a file is async in node by default and bunyan file stream uses fs.createWriteStream which is async, isn't it?

As I understand the incident was caused by Parsoid using logging to stderr/stdout that were redirected to a file, which are indeed blocking. File logging is non-blocking.

@Pchelolo, in the disk-full situation, writing asynchronously to files might still cause memory to build up & the service to run out of memory. It would be good to rule that out, either by testing, or by using UDP reporting.

I've done some testing on what's happening if the partition where we log fills up. I've created a VM with a 100 meg partition and started trace-logging there. When the partition fills up, an unhandled exception is raised and a worker is bing killed. This makes the master process attempt to log that a worker has died, and the master is also being killed.

Of course, a filled-up partition is a catastrophic event, but I think in the logging we might be able to survive that, at least don't die instantly and definitely don't kill the master. It is possible to catch the error, however, if we just simply ignore it, we get a huge memory leak on write requests to FS, so we not only need to catch it, but also destroy the file writing stream in the event of full disk

@Pchelolo awesome! Thank you for testing. Could you also test the scenario where logs are sent to syslog via udp and then fill the partition again? So that we can be informed of the outcomes, make a decision and proceed accordingly.

@Pchelolo awesome! Thank you for testing. Could you also test the scenario where logs are sent to syslog via udp and then fill the partition again? So that we can be informed of the outcomes, make a decision and proceed accordingly.

Sure thing, I didn't create a whole VM for just a single test.

Anyway, I'm preparing a solution for file logging issue, I think that should be fixed anyway, most services do use file logging, and we don't want the whole SCB crash if one service fills up the partition.

Anyway, I'm preparing a solution for file logging issue, I think that should be fixed anyway,

\o/ This is splendid! Thanks a ton!

most services do use file logging, and we don't want the whole SCB crash if one service fills up the partition.

Yup, precisely why I want us to explore the possibilities. One RB host going down because of a filled-up partition makes less of an impact than one SCB dooming in the same way.

So, I've tested the syslog-over-udp solution too, and as we've expected, the service doesn't even notice the filled-up partition.

Also, create a PR that's fixing up file logging issue: https://github.com/wikimedia/service-runner/pull/156

So, the question is whether we want syslog-over-udp with some custom rules to filter our the logs for rest base to a specific location, or fixed and improved file logging is good enough?

I think that with service-runner v2.2.3 we are solid to go for normal file logging.

I flagged the possibility of other write errors (apart from out-of-space) on the PR. I really think we should err on the side of caution, and rule out running out of memory from failed logging in all reasonable cases.

Change 339501 had a related patch set uploaded (by Ppchelko):
Enable local logginng for RESTBase

https://gerrit.wikimedia.org/r/339501

I've put the patch for puppet SWAT on March 02.

Change 339501 merged by Filippo Giunchedi:
[operations/puppet] Enable local logging for RESTBase

https://gerrit.wikimedia.org/r/339501

Something is wrong there. RB is not even creating the file, despite the fact that the directory permissions are correct:

mobrovac@restbase1009:~$ ls -alhF /var/log/restbase/
total 12K
drwxr-xr-x  2 restbase root 4.0K Mar  7 18:03 ./
drwxr-xr-x 19 root     root 4.0K Mar  7 06:25 ../
-rw-r--r--  1 restbase root 1.1K Mar  7 18:38 syslog.log

The file creation does not seem to be the problem here; I manually touched main.log as the restbase user and restarted the service, but it is still not writing to it:

root@restbase1009:~# ls -alhF /var/log/restbase/
total 12K
drwxr-xr-x  2 restbase root     4.0K Mar  7 18:53 ./
drwxr-xr-x 19 root     root     4.0K Mar  7 06:25 ../
-rw-r--r--  1 restbase restbase    0 Mar  7 18:53 main.log
-rw-r--r--  1 restbase root     2.3K Mar  7 18:53 syslog.log

Ok, it turns out the problem is that firejail doesn't have /var/log/restbase whitelisted. RB is actually logging stuff:

root@restbase1009:~# firejail --list
22774:restbase:/usr/bin/firejail --blacklist=/root --blacklist=/home --caps --seccomp /usr/bin/nodejs restbase/server.js -c /etc/restbase/config.yaml 
root@restbase1009:~# firejail --join=22774
Switching to pid 22775, the first child process inside the sandbox
Child process initialized
root@restbase1009:~# ls -alhF /var/log/restbase
total 72K
drwxr-xr-x  2 restbase root      60 Mar  7 18:53 ./
drwxr-xr-x 19 root     root     420 Mar  7 18:53 ../
-rw-r--r--  1 restbase restbase 72K Mar  7 20:13 main.log

Ok, after losing half of a day on this, I realised that using /var/log is not going to fly with firejail. It explicitly [mounts an empty /var/log](https://github.com/netblue30/firejail/blob/0.9.44.6/src/firejail/fs_var.c#L110-L148) and never exposes it.

Hence, we are left with these options:

  1. use /srv/log
  2. use an alternative location that is not on /srv
  3. no local logging

The last one doesn't seem like an option, really. I could go for (1) as long as we have proper log rotation and removal in place. Option (2) may sound appealing, but randomly scattering the log files over the disk does not sound future-proof to me at all.

How about going back to syslog over UDP, or capturing stdout / stderr in systemd?

Change 342103 had a related patch set uploaded (by Mobrovac):
[operations/puppet] RESTBase: Send the logs locally to stdout/syslog

https://gerrit.wikimedia.org/r/342103

no movement on this for a while, is this work still ongoing?

Change 342103 abandoned by Mobrovac:
RESTBase: Send the logs locally to stdout/syslog

Reason:
Rendered obsolete by I88a7bb4a0bb686cf78979dfca7a7da0a50c82c73

https://gerrit.wikimedia.org/r/342103

mobrovac claimed this task.
mobrovac removed a project: Patch-For-Review.

Yup. RESTBase is logging locally as well. Resolving.