
Setup monitoring for database servers in beta cluster
Open, Stalled, Low, Public

Description

Part of making sure that everything in the MW pipeline is monitored in beta as well.

Event Timeline

yuvipanda claimed this task.
yuvipanda raised the priority of this task from to Medium.
yuvipanda updated the task description.
yuvipanda added subscribers: greg, scfc, Krinkle and 8 others.
greg renamed this task from Setup monitoring for database servers in betalabs to Setup monitoring for database servers in beta cluster. Mar 10 2015, 8:47 PM
greg set Security to None.
hashar added subscribers: Ryasmeen, Shizhao, thcipriani, Krenair.

From T97120

The beta cluster MySQL servers turned out to be down for a few hours (T96905) and there is no monitoring for them.

We would need a check on both instances (deployment-db1 and deployment-db2) to ensure the mysqld process is running.

The command line looks like:

/usr/sbin/mysqld --basedir=/usr --datadir=/mnt/sqldata \
  --plugin-dir=/usr/lib/mysql/plugin --user=mysql \
  --log-error=/mnt/sqldata/deployment-db1.err \
  --pid-file=/mnt/sqldata/deployment-db1.pid \
  --socket=/tmp/mysql.sock --port=3306

I guess we can just monitor whether a /usr/sbin/mysqld process is present.
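
A minimal sketch of such a process check, in Python. The task does not say which monitoring system beta uses, so this assumes a Nagios-style plugin convention (exit code 0 for OK, 2 for CRITICAL). The function names `list_cmdlines`, `check_process` and `main` are hypothetical, and the binary path is taken from the command line quoted above.

```python
import os


def list_cmdlines(proc_root="/proc"):
    """Return the argv list of every process visible under proc_root (Linux only)."""
    cmdlines = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited or is not readable; skip it
        # Arguments in /proc/<pid>/cmdline are NUL-separated.
        argv = [a.decode("utf-8", "replace") for a in raw.split(b"\0") if a]
        cmdlines.append(argv)
    return cmdlines


def check_process(binary, cmdlines):
    """Return a Nagios-style (exit_code, message) tuple for the given binary."""
    for argv in cmdlines:
        if argv and argv[0] == binary:
            return 0, "OK: %s is running" % binary
    return 2, "CRITICAL: %s is not running" % binary


def main():
    # Assumption: mysqld is started by its full path, as in the quoted command line.
    code, message = check_process("/usr/sbin/mysqld", list_cmdlines())
    print(message)
    return code
```

A plugin script would end with `raise SystemExit(main())` so that the monitoring system sees the exit code. Matching on the full path in `argv[0]` is a deliberate simplification; a check that should also catch a differently launched mysqld would need to compare the basename instead.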

hashar lowered the priority of this task from Medium to Low. Jun 15 2015, 7:17 PM

Per beta cluster weekly triage:

The MySQL databases only went down a couple of times over 4 years, and we quickly noticed each time it happened. The lack of monitoring is certainly annoying but is not that big a deal, hence lowering the priority.

hashar changed the task status from Open to Stalled. Oct 30 2015, 10:51 PM
Aklapper changed the task status from Stalled to Open. May 19 2020, 4:00 PM

The previous comments don't explain what/who exactly this task is stalled on ("If a report is waiting for further input (e.g. from its reporter or a third party) and can currently not be acted on"). Hence resetting task status.

If this task should not be worked on because fixing it is not worth the effort, then it should have the "Declined" status.

greg changed the task status from Open to Stalled. May 19 2020, 4:13 PM

Reflecting reality of team resourcing.