While investigating T102396, it became obvious that our icinga log rotation dropped a large span of entries when rotating icinga.log to icinga.log.1, losing significant alert/failure history. This should be fixed!
Event Timeline
See also T7 (Get icinga alerts into logstash), which is about sending icinga alerts to logstash. Not sure if we'd benefit from the complete icinga history there, though.
I think this got fixed with the icinga upgrade/refactor; the rotated log now ends only 7 seconds before the current one begins:
einsteinium:/var/log/icinga$ head -3 icinga.log
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;backup4001;check_swap;0;SWAP OK - 100% free (15249 MB out of 15249 MB) |swap=15249MB;14487;13724;0;15249
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;fdb2001;check_zombie;0;PROCS OK: 0 processes with STATE = Z | procs=0;5;50;0;
[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;saiph;check_puppetrun;0;OK: Puppet is currently enabled, last run 1029 seconds ago with 0 failures
einsteinium:/var/log/icinga$ tail -3 icinga.log.1
[1480487096] Return code of 255 is out of bounds
[1480487096] Return code of 255 is out of bounds
[1480487096] Return code of 255 is out of bounds
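The head/tail comparison above could also be scripted, so a rotation gap is caught without eyeballing timestamps. The following is only a rough sketch: the function names `parse_timestamp` and `rotation_gap` are made up for illustration, and it assumes every relevant log line starts with a Unix epoch timestamp in square brackets, as in the excerpts above.

```python
import re

# Icinga log lines in the excerpts start with an epoch timestamp in brackets.
TIMESTAMP_RE = re.compile(r"^\[(\d+)\]")


def parse_timestamp(line):
    """Return the leading epoch timestamp of a log line, or None if absent."""
    match = TIMESTAMP_RE.match(line)
    return int(match.group(1)) if match else None


def rotation_gap(rotated_lines, current_lines):
    """Seconds between the last timestamped line of the rotated log
    and the first timestamped line of the current log."""
    last_rotated = next(
        (ts for ts in map(parse_timestamp, reversed(rotated_lines)) if ts is not None),
        None,
    )
    first_current = next(
        (ts for ts in map(parse_timestamp, current_lines) if ts is not None),
        None,
    )
    if last_rotated is None or first_current is None:
        raise ValueError("no timestamped lines found")
    return first_current - last_rotated


# Sample lines copied from the transcript above.
rotated_tail = ["[1480487096] Return code of 255 is out of bounds"]
current_head = [
    "[1480487103] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;saiph;"
    "check_puppetrun;0;OK: Puppet is currently enabled, "
    "last run 1029 seconds ago with 0 failures"
]

gap = rotation_gap(rotated_tail, current_head)
print(f"gap across rotation: {gap} seconds")
```

In practice the two lists would come from reading the end of icinga.log.1 and the start of icinga.log; a gap much larger than the usual interval between log entries would suggest that rotation lost history again.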