The replica tools-db-2 is currently lagging 2.7 hours behind the primary tools-db-1 (Grafana chart). This is (correctly) triggering the alert ToolsToolsDBReplicationLagIsTooHigh.
SHOW SLAVE STATUS\g on the replica shows that replication is active, but it's taking hours to process a single transaction.
This happened before (T341891, T338031, T343819). The replica usually catches up after a couple of days, without any intervention.
I followed the runbook and identified the database, table and query causing the issue. The table is s54113__spacemedia.dvids_video_files and the query is:
UPDATE dvids_video_files SET dvids_video_repo_id = 'video' WHERE dvids_video_repo_id = '5'
I have acked the alert until Monday, hoping that this will resolve by itself. On Monday I will check if it's possible to add an index to that table to prevent this from happening in the future.
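A rough sketch of what the Monday check could look like, using only standard MariaDB/MySQL statements. The index name idx_dvids_video_repo_id is hypothetical, and I'm assuming dvids_video_repo_id is a column short enough to index in full. Whether an index on this column would actually speed up the replica has not been verified. It depends on the table's existing keys and on the replication format, so the first two statements are there to check that before changing anything.

```sql
-- Inspect the current table definition to see which keys (if any) already exist.
SHOW CREATE TABLE s54113__spacemedia.dvids_video_files;

-- See how the WHERE clause of the slow UPDATE is resolved;
-- type = ALL in the output would indicate a full table scan.
EXPLAIN SELECT * FROM s54113__spacemedia.dvids_video_files
WHERE dvids_video_repo_id = '5';

-- Candidate fix (hypothetical index name, assumes the column can be indexed in full).
ALTER TABLE s54113__spacemedia.dvids_video_files
  ADD INDEX idx_dvids_video_repo_id (dvids_video_repo_id);
```

The ALTER TABLE would need to run on the primary so that it replicates, and it should probably wait until the replica has caught up.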