Cron conflict for kafkatee logrotate on oxygen
Closed, Resolved, Public

Description

Cron error was:

/etc/cron.daily/logrotate:
error: error running non-shared postrotate script for /var/cache/kafkatee/kafkatee.stats.json of '/var/cache/kafkatee/kafkatee.stats.json '
run-parts: /etc/cron.daily/logrotate exited with return code 1

From a quick look at the files, it seems that we have two issues:

  1. /etc/logrotate.d/kafkatee is in charge of rotating /var/cache/kafkatee/kafkatee.stats.json and issues a reload of kafkatee.
  2. /etc/logrotate.d/kafkatee-webrequest is in charge of rotating many other kafkatee files and also issues a reload of kafkatee. Moreover, it doesn't have the sharedscripts directive for logrotate, so I think it issues a reload for each rotated file.

My educated guess is that moving the path /var/cache/kafkatee/kafkatee.stats.json into the /etc/logrotate.d/kafkatee-webrequest configuration and using the sharedscripts directive should solve the issue.
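A rough sketch of what such a merged configuration could look like, written to a temporary file so that nothing under /etc is touched. Only the stats.json path and the reload command come from this task; the second path and the rotation options (weekly, rotate, compress, missingok) are placeholders, not the real contents of the puppet-managed file.

```shell
# Write a hypothetical merged logrotate config to a temp file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
# First path is from this task; the second is a placeholder for the
# other kafkatee output files (assumption, not the real path).
/var/cache/kafkatee/kafkatee.stats.json
/path/to/other/kafkatee/output/*.log
{
    # Rotation options below are illustrative assumptions.
    weekly
    rotate 4
    missingok
    compress
    # Run postrotate once for the whole set instead of once per file.
    sharedscripts
    postrotate
        service kafkatee reload
    endscript
}
EOF
echo "sketch written to $conf"
```

Running `logrotate -d "$conf"` afterwards should parse the file in debug mode without rotating anything, which is a cheap way to check the syntax before shipping it.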

Event Timeline

Volans created this task. Nov 28 2016, 10:48 AM

Thanks! I already tried to fix another similar issue for kafkatee+oxygen in T132322, mentioning it since I might be the root cause :)

elukey added a comment. Edited Dec 12 2016, 6:55 AM
elukey@oxygen:~$ sudo invoke-rc.d kafkatee reload
elukey@oxygen:~$ echo $?
102

 102    Subsystem error.  Init script (or policy layer) subsystem malfunction. Also, forced init script execution due to --try-anyway or --force failed.

elukey@oxygen:~$ sudo service kafkatee reload
elukey@oxygen:~$ echo $?
0

So /etc/logrotate.d/kafkatee uses invoke-rc.d kafkatee reload, which returns 102, while /etc/logrotate.d/kafkatee-webrequest uses service kafkatee reload, which returns 0.
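The exit code matters because logrotate reports an error when a postrotate script exits non-zero, which matches the cron mail in the description. A small sketch of that difference follows; the two functions are stand-ins that only mimic the exit codes observed above (they do not call the real commands), and check_reload is a hypothetical helper.

```shell
# Stand-ins for the two reload commands; exit codes mirror the ones seen on oxygen.
reload_via_invoke_rcd() { return 102; }
reload_via_service() { return 0; }

# Run the given command and report whether a postrotate step would count as failed.
check_reload() {
    if "$@"; then
        echo "ok"
    else
        # In the else branch, $? still holds the exit status of the command above.
        echo "failed with exit code $?"
    fi
}

check_reload reload_via_invoke_rcd   # prints: failed with exit code 102
check_reload reload_via_service      # prints: ok
```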

I agree with Riccardo: we'd probably need to merge the two logrotate scripts. The kafkatee package ships with its own logrotate script, so we'd need to remove it from there and let only puppet do the work:

elukey@oxygen:~$ dpkg -S /etc/logrotate.d/kafkatee
kafkatee: /etc/logrotate.d/kafkatee

Myself from the past created T145490

Change 354223 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/kafkatee@master] Remove logrotate and syslog configuration

https://gerrit.wikimedia.org/r/354223

Mentioned in SAL (#wikimedia-operations) [2017-05-18T13:14:19Z] <elukey> reloaded kafkatee to test T151748

elukey moved this task from Backlog to Analytics Backlog on the User-Elukey board. May 18 2017, 4:42 PM

Change 354223 merged by Elukey:
[analytics/kafkatee@master] Remove logrotate and syslog configuration

https://gerrit.wikimedia.org/r/354223

Mentioned in SAL (#wikimedia-operations) [2017-06-30T09:54:21Z] <elukey> uploaded kafkatee 0.1.6-1 to reprepro - T151748

Change 362382 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet/kafkatee@master] Replace 'invoke-rc.d' with 'service' in logrotate config

https://gerrit.wikimedia.org/r/362382

Change 362382 merged by Elukey:
[operations/puppet/kafkatee@master] Replace 'invoke-rc.d' with 'service' in logrotate config

https://gerrit.wikimedia.org/r/362382

elukey moved this task from In Progress to Done on the User-Elukey board. Jun 30 2017, 12:15 PM

Let's wait for the regular weekly rotations to happen before calling this a win :)

elukey closed this task as Resolved. Jul 3 2017, 8:01 AM
elukey claimed this task.

The bug seems resolved, but for posterity: it seems to me that https://gerrit.wikimedia.org/r/362382 alone would have been enough to fix it. https://gerrit.wikimedia.org/r/354223 has also been deployed, but it might be reverted in the future if we decide that the kafkatee deb package needs to ship default logrotate/syslog configurations.