
icinga log rotation wipes out portions of history
Closed, Invalid (Public)

Description

While investigating T102396, it became obvious that our icinga log rotation dropped a large span of time from icinga.log when rotating it to icinga.log.1, losing significant alert/failure history. This should be fixed!

Event Timeline

BBlack raised the priority of this task to Medium.
BBlack updated the task description.
BBlack added a project: acl*sre-team.
BBlack added subscribers: BBlack, mark.

See also T7: Get icinga alerts into logstash, which covers sending icinga alerts to logstash. I'm not sure we'd benefit from having the complete icinga history there, though.

I think this got fixed with the icinga upgrade/refactor. The last entry in icinga.log.1 and the first entry in icinga.log are now only 7 seconds apart:

einsteinium:/var/log/icinga$ head -3 icinga.log
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;backup4001;check_swap;0;SWAP OK - 100% free (15249 MB out of 15249 MB) |swap=15249MB;14487;13724;0;15249
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;fdb2001;check_zombie;0;PROCS OK: 0 processes with STATE = Z | procs=0;5;50;0;
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;saiph;check_puppetrun;0;OK: Puppet is currently enabled, last run 1029 seconds ago with 0 failures
einsteinium:/var/log/icinga$ tail -3 icinga.log.1 
[1480487096] Return code of 255 is out of bounds
[1480487096] Return code of 255 is out of bounds
[1480487096] Return code of 255 is out of bounds
einsteinium:/var/log/icinga$
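
The manual head/tail comparison above could be scripted roughly as follows. This is only a sketch: `rotation_gap` is a hypothetical helper name, and it assumes every relevant log line starts with `[<epoch seconds>]`, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of icinga): print the number of seconds between
# the last timestamped line of the rotated log and the first timestamped line
# of the current log. A large value suggests history was lost during rotation.
rotation_gap() {
    rotated=$1
    current=$2
    # Assumption: log lines start with "[<epoch seconds>]"; keep only the number.
    last=$(sed -n 's/^\[\([0-9][0-9]*\)].*/\1/p' "$rotated" | tail -n 1)
    first=$(sed -n 's/^\[\([0-9][0-9]*\)].*/\1/p' "$current" | head -n 1)
    if [ -z "$last" ] || [ -z "$first" ]; then
        echo "no timestamped lines found" >&2
        return 1
    fi
    echo $((first - last))
}
```

Called as `rotation_gap icinga.log.1 icinga.log` in /var/log/icinga, it would print the difference between the two boundary timestamps; for the values shown above that is 1480487103 - 1480487096 = 7.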