Add more constraints to alerting on failed backups
Open, Needs Triage, Public

Description

This is a follow-up to T314852: SQL logical backups appear to be failing.

The backups were not taken for more than a week, but we never found out. This could probably have been spotted by adding more constraints to the existing alert that checks whether a backup was taken.

Acceptance criteria (AC)

  • The check that backups have been taken DOES NOT rely on a single log output alone.

Useful links:

Useful metrics:

  • storage.googleapis.com/storage/total_byte_seconds (resource type: gcs_bucket)
  • storage.googleapis.com/storage/total_bytes (resource type: gcs_bucket)
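
One possible way to use these metrics as a second signal, sketched below in Python: treat the backup as stale if the bucket's total_bytes has not grown within some window, and alert when either the log signal or the storage signal fails. The function names, the sample format (a list of (timestamp, total_bytes) pairs already fetched from monitoring), and the two-day default window are all assumptions made for illustration, not part of the existing alert.

```python
from datetime import datetime, timedelta, timezone


def storage_looks_stale(samples, now, max_age=timedelta(days=2)):
    """Return True if total_bytes has not increased within max_age of now.

    samples: iterable of (datetime, total_bytes) pairs for the backup
    bucket (hypothetical input; how they are fetched is out of scope).
    """
    ordered = sorted(samples)
    last_growth = None
    # Walk consecutive samples and remember the latest time the bucket grew.
    for (_, prev_bytes), (ts, cur_bytes) in zip(ordered, ordered[1:]):
        if cur_bytes > prev_bytes:
            last_growth = ts
    if last_growth is None:
        # No growth observed at all (or too few samples): treat as stale.
        return True
    return now - last_growth > max_age


def should_alert(log_reported_success, samples, now):
    """Alert if EITHER signal fails, so the log line alone is not trusted."""
    return (not log_reported_success) or storage_looks_stale(samples, now)
```

Bucket growth is only a proxy: retention policies or object deletions could hide a successful backup, so the window and the comparison would need tuning against the real bucket.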