Create alerts for rsyslog rate limiting
Closed, Declined (Public)

Description

We need those log lines - if we're skipping some, we should know ASAP.

Event Timeline

As of this morning we're collecting the syslog total message count and dropped message count in Prometheus. Both are on the main fundraising dashboard here: https://grafana.wikimedia.org/dashboard/db/fundraising-overview?refresh=1m&orgId=1. We haven't figured out alerting from Prometheus/Grafana yet, but that's the logical next step.
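For reference, the alert condition itself could look roughly like the sketch below. This is only an illustration, not something we run: the function names, the `max_dropped` threshold, and the idea of feeding it a short window of dropped-counter readings are all assumptions, and it assumes the dropped count is exported as a cumulative counter that restarts from zero when rsyslog restarts.

```python
from typing import Sequence


def dropped_increase(samples: Sequence[float]) -> float:
    """Sum the growth of a cumulative counter across consecutive readings.

    Hypothetical helper: assumes the counter only goes down when it has
    been reset to zero (for example after a daemon restart).
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # Counter was reset; everything counted since the reset is new.
            total += cur
    return total


def should_alert(dropped_samples: Sequence[float], max_dropped: float = 0.0) -> bool:
    """Return True when more than max_dropped messages were dropped in the window.

    max_dropped is an assumed per-host tolerance; a noisy host could be
    given a higher value so it does not alert constantly.
    """
    return dropped_increase(dropped_samples) > max_dropped


if __name__ == "__main__":
    window = [120.0, 120.0, 135.0]  # made-up readings of the dropped counter
    print(should_alert(window))
    print(should_alert(window, max_dropped=50.0))
```

A per-host threshold is one way to keep a host with known heavy log traffic from paging constantly while still alerting on any drop elsewhere.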

Also, at this stage it would be impractical to alert, because the alarm would go off constantly for the civicrm host, where queue consumers are still pushing too much log traffic.

Darn, I'd hoped the Civi host would be fine after cutting the message rate in half, but it looks like there are still spikes of rate limiting.

Jgreen changed the task status from Open to Stalled. May 24 2018, 3:39 PM

We're no longer running into this after some serious rsyslog tuning.