
set up a cron job that watches dumps logs for exceptions and reports periodically
Closed, Resolved. Public. 0 Estimated Story Points

Description

We get emails for some failures, and Kibana catches others. There is an in-between category of MediaWiki failures that get logged to disk, and we should keep track of those too. A cron job that runs every few hours and checks the dump run logs for recent exception information would be useful. It would provide more detail than the failure emails and could give advance notice of regressions.
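
As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not the script from the patches linked below. The `*.log` file naming, the matching pattern, the four-hour window, and the function names `find_recent_exceptions` and `format_report` are all assumptions made for the example; the real dump logs may be laid out and marked differently.

```python
import re
import time
from pathlib import Path

# Assumed pattern; the real logs may mark failures differently.
EXCEPTION_RE = re.compile(r"exception|fatal error", re.IGNORECASE)


def find_recent_exceptions(log_dir, max_age_hours=4, now=None):
    """Return {log path: [matching lines]} for logs modified within the window."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    report = {}
    # Assumes logs end in ".log" somewhere under log_dir (hypothetical layout).
    for path in sorted(Path(log_dir).rglob("*.log")):
        if path.stat().st_mtime < cutoff:
            continue  # not modified inside the window, so skip it
        with path.open(errors="replace") as handle:
            hits = [line.rstrip("\n") for line in handle
                    if EXCEPTION_RE.search(line)]
        if hits:
            report[str(path)] = hits
    return report


def format_report(report):
    """Render the findings as plain text, e.g. for an email body."""
    chunks = []
    for path, lines in report.items():
        chunks.append(f"{path}: {len(lines)} matching line(s)")
        chunks.extend(f"    {line}" for line in lines)
    return "\n".join(chunks)
```

A cron entry could run a small wrapper that calls `find_recent_exceptions` on the dump log directory and mails the output of `format_report` only when the report is non-empty. Filtering by file modification time keeps the scan cheap, but it re-reports old lines in a log that is still being written; tracking per-line timestamps or file offsets would avoid that.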

Event Timeline

ArielGlenn triaged this task as Medium priority. Aug 8 2019, 7:00 AM
ArielGlenn created this task.

Change 528995 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] look at dumps logs every so often for exceptions and report them

https://gerrit.wikimedia.org/r/528995

Change 528995 merged by ArielGlenn:
[operations/puppet@production] look at dumps logs every so often for exceptions and report them

https://gerrit.wikimedia.org/r/528995

Change 529356 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] dump exception checker uses python rather than bash

https://gerrit.wikimedia.org/r/529356

Change 529356 merged by ArielGlenn:
[operations/puppet@production] dump exception checker uses python rather than bash

https://gerrit.wikimedia.org/r/529356

This appears to be working as it should. Closing.