Page MenuHomePhabricator

Rotate MediaWiki-Vagrant log files to prevent logs from eating up all disk space
Open, Needs TriagePublic

Description

If you use vagrant, soon enough you'll run out of disk space because of huge log files, some of which are not rotated at all, some are rotated but kept for too long.

Worst offenders:

  • /var/log/syslog and /var/log/daemon.log (rotated daily but even so grow to gigabytes)
  • /vagrant/logs/restbase.log and /vagrant/logs/changeprop.log (have rotation rules but they don't seem to work; can grow by 1M/s when something is broken, even if the wiki is not used at all; most people only have them as a dependency and don't care about the logs)

Less bad but still not great:

  • mediawiki-wiki-debug.log (rotated daily but very verbose so it can still grow large)
  • /vagrant/logs/xdebug.log (not rotated)
  • /var/log/messages, /var/log/kern.log (rotated but still grows to hundreds of megs)

None of these have much value going back more than a day, if not hours.

Event Timeline

Mattflaschen-WMF renamed this task from More aggressive log rotation for daemon.log and syslog.1 to save disk space to More aggressive log rotation for daemon.log and syslog to save disk space.Apr 28 2017, 11:02 PM
Tgr renamed this task from More aggressive log rotation for daemon.log and syslog to save disk space to Rotate MediaWiki-Vagrant log files to prevent logs from eating up all disk space.Dec 24 2019, 12:32 AM
Tgr updated the task description. (Show Details)
Tgr added a subscriber: Tgr.

Made this a little more generic.

Extra problematic on semi-production setups like Wikispore.

Tgr moved this task from Backlog to Next-up on the Wikispore board.

As it turns out the logrotate config is actually not unreasonable, but my /var/lib/logrotate/status has a bunch of 2296-5-30-1:6:22 dates:

logrotate state -- version 2
"/vagrant/logs/mediawiki-enwiki-debug.log" 2296-5-30-1:6:22
"/var/log/syslog" 2296-5-30-1:6:22
"/var/log/dpkg.log" 2019-8-1-0:5:33
"/vagrant/logs/mediawiki-exception.log" 2296-5-30-1:6:22
"/var/log/auth.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-frwiktionarywiki-debug.log" 2296-5-30-1:6:22
"/var/log/apt/term.log" 2019-8-1-0:5:33
"/var/log/apt/history.log" 2019-8-1-0:5:33
"/var/log/mysql/error.log" 2296-5-30-1:6:22
"/var/log/alternatives.log" 2019-8-1-0:5:33
"/vagrant/logs/mediawiki-loginwiki-debug.log" 2296-5-30-1:6:22
"/var/log/debug" 2019-8-20-11:11:33
"/var/log/mail.log" 2019-8-20-11:11:33
"/vagrant/logs/mediawiki-zhwiki-debug.log" 2296-5-30-1:6:22
"/var/log/kern.log" 2296-5-30-1:6:22
"/var/log/redis/redis-server.log" 2296-5-30-1:6:22
"/var/log/apache2/access.log" 2019-7-15-12:0:0
"/var/log/wtmp" 2296-5-30-1:6:22
"/var/log/daemon.log" 2296-5-30-1:6:22
"/var/log/mail.warn" 2019-7-23-7:16:20
"/var/log/btmp" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-frwiki-debug.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-centralauthtestwiki-debug.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-hewiki-debug.log" 2296-5-30-1:6:22
"/var/log/user.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-ruwiki-debug.log" 2296-5-30-1:6:22
"/var/log/mail.info" 2019-8-20-11:11:33
"/var/log/apache2/other_vhosts_access.log" 2019-7-15-12:0:0
"/var/log/apache2/error.log" 2019-7-22-9:41:34
"/var/log/exim4/mainlog" 2019-7-22-9:41:34
"/vagrant/logs/mediawiki-wiki-debug.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-zhwikivoyagewiki-debug.log" 2296-5-30-1:6:22
"/var/log/messages" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-runJobs.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-mobilewiki-debug.log" 2296-5-30-1:6:22
"/vagrant/logs/mediawiki-eswiki-debug.log" 2296-5-30-1:6:22

and that prevents logrotate from running (it just dies with error: bad year 2296 - apparently that's logrotate/logrotate#247). I'd ascribe it to my setup being messed up, except I have recently seen another freshly set-up box fill up in a few days.

In any case, deleting the status file and running logrotate manually (sudo logrotate -f /etc/logrotate.conf) freed up ~5G space and things seem to be working again.

Probably still worth setting up more-frequent-than-daily rotation for the services logfiles.

Here's a logrotate statusfile from another VM (on the same laptop):

logrotate state -- version 2
"/vagrant/logs/mediawiki-enwiki-debug.log" 2296-6-25-9:12:3
"/var/log/syslog" 2296-6-25-9:12:3
"/var/log/elasticsearch/elasticsearch.log" 2019-3-20-7:0:0
"/var/log/dpkg.log" 2019-7-10-19:17:1
"/var/log/auth.log" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-frwiktionarywiki-debug.log" 2296-6-25-9:12:3
"/var/log/apt/term.log" 2019-7-10-19:17:1
"/var/log/apt/history.log" 2019-7-10-19:17:1
"/var/log/mysql/error.log" 2296-6-25-9:12:3
"/var/log/alternatives.log" 2019-9-15-14:17:1
"/vagrant/logs/mediawiki-loginwiki-debug.log" 2296-6-25-9:12:3
"/var/log/debug" 2019-9-15-10:51:32
"/var/log/mail.log" 2019-9-15-10:51:32
"/vagrant/logs/mediawiki-zhwiki-debug.log" 2296-6-25-9:12:3
"/var/log/kern.log" 2296-6-25-9:12:3
"/var/log/redis/redis-server.log" 2296-6-25-9:12:3
"/var/log/apache2/access.log" 2019-3-20-7:0:0
"/var/log/wtmp" 2296-6-25-9:12:3
"/var/log/daemon.log" 2296-6-25-9:12:3
"/var/log/mail.warn" 2019-9-15-10:51:32
"/var/log/btmp" 2296-6-25-9:12:3
"/var/log/elasticsearch/elasticsearch_deprecation.log" 2019-3-20-7:0:0
"/vagrant/logs/mediawiki-frwiki-debug.log" 2296-6-25-9:12:3
"/var/log/elasticsearch/elasticsearch_index_indexing_slowlog.log" 2019-3-20-7:0:0
"/vagrant/logs/mediawiki-centralauthtestwiki-debug.log" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-hewiki-debug.log" 2296-6-25-9:12:3
"/var/log/user.log" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-ruwiki-debug.log" 2296-6-25-9:12:3
"/var/log/mail.info" 2019-9-15-10:51:32
"/var/log/apache2/other_vhosts_access.log" 2019-3-20-7:0:0
"/var/log/apache2/error.log" 2019-3-21-0:17:1
"/var/log/exim4/mainlog" 2019-3-21-0:17:1
"/vagrant/logs/mediawiki-wiki-debug.log" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-zhwikivoyagewiki-debug.log" 2296-6-25-9:12:3
"/var/log/elasticsearch/elasticsearch_index_search_slowlog.log" 2019-3-20-7:0:0
"/var/log/messages" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-mobilewiki-debug.log" 2296-6-25-9:12:3
"/vagrant/logs/mediawiki-eswiki-debug.log" 2296-6-25-9:12:3

Same type of error, different date. The difference between the last valid date and the bogus one is exactly 101,091 days in both cases. Strange...

Another instance: 2297-1-7-14:49:46 and 2020-3-29-21:17:1. Again 101,091 days difference. So somehow sometimes logrotate logs a timestamp 101,091 days int the future into its status file, and that breaks all further log rotation attempts.

Change 605615 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/vagrant@master] Delete logrotate status file periodically

https://gerrit.wikimedia.org/r/605615

Change 605615 merged by jenkins-bot:
[mediawiki/vagrant@master] Delete logrotate status file when it has errors

https://gerrit.wikimedia.org/r/c/mediawiki/vagrant/ /605615

Another instance: 2297-1-7-14:49:46 and 2020-3-29-21:17:1. Again 101,091 days difference. So somehow sometimes logrotate logs a timestamp 101,091 days int the future into its status file, and that breaks all further log rotation attempts.

The nasty workaround for that has been merged. The proper fix will be T256822: [EPIC] Upgrade MediaWiki-Vagrant to Debian Buster as that upgrades logrotate to a version where this bug is fixed.