Hourly log rotation for large MW logs
Open, Medium, Public


The daily API log is about 211 GB uncompressed (43 GB compressed), so it takes a while to find the log entry you're looking for, even with an exact timestamp. T275101 proposes adding a new API log, perhaps with some fields deduplicated between the two logs, which will increase the need to find correlated entries across them. To make this quicker, I propose that we move to hourly rotation for api.log.

In terms of configuration files on mwlog1001:

  • Duplicate the existing section in /etc/logrotate.d/udp2log-mw. In the copy, change the filename to /srv/mw-log/api.log and change daily to hourly. I also suggest adding datehourago, so that each archive is named for the hour its entries were written rather than the hour in which it was rotated. Really we should have had dateyesterday from the outset, and maybe that's a good idea for when we switch to mwlog1002 (T224565), but it's easy enough to change here. The specific section for api.log needs to come after the wildcard section, which is simplest if they are in the same file.
  • Move /etc/cron.daily/logrotate to /etc/cron.hourly/logrotate, since the hourly directive only takes effect if logrotate itself runs every hour.
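
As a rough sketch of the first step, the new section might look something like the following. Only the filename, hourly and datehourago come from the proposal above. dateext is assumed to be present already in the existing section, and the remaining options are left as a placeholder comment because the current contents of the file are not reproduced in this task.

```
# Sketch only: dedicated section for api.log, placed after the existing
# wildcard section in /etc/logrotate.d/udp2log-mw.
/srv/mw-log/api.log {
    # Rotate every hour; logrotate itself must also be run hourly.
    hourly
    # Assumed to be inherited from the existing section.
    dateext
    # Name the archive for the hour that has just ended.
    datehourago
    # Remaining options (compression, retention, postrotate, etc.)
    # would be copied unchanged from the existing section.
}
```

The retention count in particular would need revisiting in the real file, since a rotate value that was chosen for daily archives covers far less time once archives are hourly.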

But I'm not sure how to puppetize that. /etc/logrotate.d/udp2log-mw comes from a shared template in the udp2log module, and I'm not sure of the correct way to rename a cron file. After T275101 there will be at least two affected log files, so it should be possible to specify a list of files that have hourly rotation.

Note that the date format in the archive filenames can be customised with dateformat. With hourly rotation the default is -%Y%m%d%H; otherwise it is -%Y%m%d.