Create alerts for rsyslog rate limiting
Closed, Declined · Public

Assigned To

None

Authored By

	Ejegg
	Sep 28 2017, 12:08 AM

Description

We need those log lines - if we're skipping some, we should know ASAP.

Related Objects

Status	Assigned	Task
Resolved	Jgreen	T91508 [Epic] overhaul fundraising cluster monitoring
Invalid	None	T197892 fundraising monitoring fixes (EPIC)
Declined	None	T176924 Create alerts for rsyslog rate limiting

Event Timeline

Ejegg created this task. Sep 28 2017, 12:08 AM

Restricted Application added a subscriber: Aklapper. Sep 28 2017, 12:08 AM

Jgreen added a parent task: T91508: [Epic] overhaul fundraising cluster monitoring. Oct 2 2017, 3:36 PM

As of this morning we're collecting the syslog total message count and the dropped message count in Prometheus. They're on the main fundraising dashboard here: https://grafana.wikimedia.org/dashboard/db/fundraising-overview?refresh=1m&orgId=1. We haven't figured out alerting from Prometheus/Grafana yet, but that's the logical next step.

In T176924#3651256, @Jgreen wrote:

As of this morning we're collecting the syslog total message count and the dropped message count in Prometheus. They're on the main fundraising dashboard here: https://grafana.wikimedia.org/dashboard/db/fundraising-overview?refresh=1m&orgId=1. We haven't figured out alerting from Prometheus/Grafana yet, but that's the logical next step.

Also at this stage it would be impractical to alert because the alarm would be going off constantly for the civicrm host, where queue consumers are still pushing too much log traffic.
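
For illustration only, here is a minimal sketch of how an alert on the dropped-message counter could avoid firing on every short spike. It is not what was deployed. It assumes the dropped count arrives as a cumulative counter sampled at a fixed interval, and the function names, threshold, and `sustain` window are all hypothetical.

```python
from typing import Sequence


def dropped_rate(samples: Sequence[float], interval_s: float) -> list[float]:
    """Per-second drop rate between consecutive samples of a cumulative counter."""
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        # A decrease means the counter was reset (e.g. the daemon restarted);
        # in that case count everything seen since the reset.
        delta = cur - prev if cur >= prev else cur
        rates.append(delta / interval_s)
    return rates


def should_alert(
    samples: Sequence[float],
    interval_s: float = 60.0,   # hypothetical scrape interval
    threshold: float = 0.0,     # hypothetical: any sustained dropping counts
    sustain: int = 3,           # hypothetical: consecutive intervals required
) -> bool:
    """Alert only if the last `sustain` intervals all exceeded the threshold."""
    rates = dropped_rate(samples, interval_s)
    if len(rates) < sustain:
        return False
    return all(r > threshold for r in rates[-sustain:])
```

With these assumptions, a single burst of drops followed by quiet intervals does not trigger an alert, while drops in every one of the last three intervals does. A host that is rate-limited continuously, as the civicrm host was at this point, would still alert until its log volume is reduced.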

Darn, I'd hoped the Civi host would be fine after cutting the message rate in half, but it looks like there are still spikes of rate limiting.

Jgreen changed the task status from Open to Stalled. May 24 2018, 3:39 PM

Jgreen added a parent task: T197892: fundraising monitoring fixes (EPIC). Jun 21 2018, 6:45 PM

We're no longer running into this after some serious rsyslog tuning.

Dwisehaupt moved this task from Triage to Done on the fundraising-tech-ops board. Feb 13 2020, 10:03 PM
