Page MenuHomePhabricator

Cassandra: Work-around broken JVM GC log rotation
Closed, DeclinedPublic

Description

The JVM writes to gc.log.{n}.current (where n=0 at startup) until the file reaches -XX:GCLogFileSize MB in size, at which point it begins writing to gc.log.{n+1}.current, and renames gc.log.{n}.current to gc.log.{n}. This is really stupid because when the JVM restarts, n is reset to 0 and the logs are again laid down chronologically from {n...n+(-XX:NumberOfGCLogFiles-1)}, overwriting the more recent historical logs first.

It should be possible to work-around some of this stupidity with a script that rearranges/reorders logs files prior to starting Cassandra.

Event Timeline

Could maybe do something with logrotate that runs pretty often? Or gets notified when the JVM stops and manually rotates?

Could maybe do something with logrotate that runs pretty often? Or gets notified when the JVM stops and manually rotates?

Unless there is some way to signal the JVM to start a new log, then I assume this will have to be something that runs when the JVM stops (or as a precondition to starting).

Could maybe do something with logrotate that runs pretty often? Or gets notified when the JVM stops and manually rotates?

Unless there is some way to signal the JVM to start a new log, then I assume this will have to be something that runs when the JVM stops (or as a precondition to starting).

No way I can find (documented, at least). I was thinking of not setting NumberOfGCLogFiles at all (infinite retention) and letting logrotate handle the rotation for us. Prior to starting the JVM, we could have an exec{} that calls logrotate --force which should rotate regardless of age/size/etc parameters.

Could maybe do something with logrotate that runs pretty often? Or gets notified when the JVM stops and manually rotates?

Unless there is some way to signal the JVM to start a new log, then I assume this will have to be something that runs when the JVM stops (or as a precondition to starting).

No way I can find (documented, at least). I was thinking of not setting NumberOfGCLogFiles at all (infinite retention) and letting logrotate handle the rotation for us. Prior to starting the JVM, we could have an exec{} that calls logrotate --force which should rotate regardless of age/size/etc parameters.

Wouldn't the log then grow unbounded in the meantime?

Soooo, after a lot of digging @ArielGlenn and I found a completely undocumented but wonderfully elegant solution that does exactly what we want.

-Xloggc is way more useful than it used to be. With the addition of this patch to openjdk, you can include parameters in the log file name, namely %t for the timestamp and %p for the pid. This patch is in all versions of openjdk-8, and appears to have been backported to 7u76, 7u79 and 7u80. Gerrit is running 7u111 on jessie.

Haven't tested yet, but introducing a timestamp or PID would alleviate the problem of not starting a new log. I dunno how this affects retention, so we might still need logrotate to come along (although --force wouldn't be needed and could run at normal intervals instead)

This comment was removed by ArielGlenn.

Woops, that was intended for gerrit, heh. (So removed)

option verified to exist on restbase hosts:

ariel@restbase1007:~$ java -Xloggc:'/home/ariel/stuff%p.log' -version
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102-8u102-b14.1-1~bpo8+1-b14)
OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)
ariel@restbase1007:~$ ls -l /home/ariel/stuffpid144216.log
-rw-r--r-- 1 ariel wikidev 418 Oct 25 14:29 /home/ariel/stuffpid144216.log

Thanks for looking at this!

option verified to exist on restbase hosts:

ariel@restbase1007:~$ java -Xloggc:'/home/ariel/stuff%p.log' -version
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102-8u102-b14.1-1~bpo8+1-b14)
OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)
ariel@restbase1007:~$ ls -l /home/ariel/stuffpid144216.log
-rw-r--r-- 1 ariel wikidev 418 Oct 25 14:29 /home/ariel/stuffpid144216.log

A PID here seems of kind of opaque. Timestamps would be better, but unfortunately (and unless I'm misreading the code), the %t substitution happens exactly once at startup; This timestamp is JVM start time. So, whether you're using a PID or timestamp, rotation is still going to round-robin and double-back to overwrite the lower-indexed files.

While that is still confusing/surprising, it would solve the problem of a JVM restart immediately clobbering the lower-indexed files. And it requires that we do something external to the JVM for rotation (and it's not clear to me what that would be).

A PID here seems of kind of opaque. Timestamps would be better, but unfortunately (and unless I'm misreading the code), the %t substitution happens exactly once at startup; This timestamp is JVM start time. So, whether you're using a PID or timestamp, rotation is still going to round-robin and double-back to overwrite the lower-indexed files.

While that is still confusing/surprising, it would solve the problem of a JVM restart immediately clobbering the lower-indexed files. And it requires that we do something external to the JVM for rotation (and it's not clear to me what that would be).

Anytime the service is restarted we'll get a new PID (or timestamp, if you'd prefer), which keeps us from starting the overwrite immediately--although you're completely right that we'll eventually round-robin if the process runs long enough.

What I think would be best is to setup some logrotate rules along with removing the NumberOfGCLogFiles option. This will handle gzip/removal and stop the round-robin problem for the most part--it could still happen if you have a log explosion prior to logrotate's cron running, but that seems like an edge case here rather than to be optimized for.

What I think would be best is to setup some logrotate rules along with removing the NumberOfGCLogFiles option. This will handle gzip/removal and stop the round-robin problem for the most part--it could still happen if you have a log explosion prior to logrotate's cron running, but that seems like an edge case here rather than to be optimized for.

Ok, so the files are going to be laid down in increasing order, with the active/open filename suffixed with .current. For example:

foolog-%t.%p.0          <-- oldest, was foolog-%t.%p.0.current
foolog-%t.%p.1          <-- was foolog-%t.%p.1.current
foolog-%t.%p.2          <-- was foolog-%t.%p.2.current
foolog-%t.%p.3.current  <-- newest, JVM is still writing here

Also, any JVM restart(s) that happen between logrotate runs are going to leave some sequence(s) of files with a past timestamp and/or pid expansions, one of which from each sequence will be suffixed with .current.

I'm trying to match up my understanding of how logrotate can be applied, with a workable solution for this, and it's not coming up awesome. Do you have ideas of how this might look?

GWicke triaged this task as Medium priority.Aug 8 2017, 9:41 PM