Decide on owner and contact for "Cross-DC database query" alert
Open, Needs Triage, Public

Description

Scope:

  • Code instrumentation with statsd (currently in the mediawiki-core/rdbms component)
  • Grafana dashboard: Performance Team / Cross-DC Database Query alert
  • Alert contact: Currently performance-team@wikimedia.org and Libera Chat #wikimedia-perf-bots.

Background:
Launch task is T258125: Provide train-blocking alerts for cross-DC mysql traffic spikes

Question:

As part of the July 2023 reorg, the Performance Team no longer exists. While some of the long-term initiatives live on as "Wikimedia Performance" (led by MediaWiki Engineering, with support from SRE Observability and QTE), specific code stewardship and incident response need explicit resourcing and ownership.

The responsibility to maintain cross-DC database stats, set an alert threshold, and respond to alerts is an example of a small but explicit responsibility that has fallen through the cracks.

I've turned off this alert as of today, Friday 13 October 2023, so it no longer notifies anyone and is no longer responded to.

Event Timeline

I've followed up with a few folks in SRE on this. Ultimately I need to discuss it further with Mark Bergsma, which is currently scheduled for Tuesday, Dec 12th.

After meeting with Mark, we have not yet landed on a clear owner for this. I'm still looking at a couple of options and will update this ticket when I have more information.