From the alert in Alertmanager (https://alerts.wikimedia.org/?q=team%3Dwmcs):
ToolsToolsDBReplicationLagIsTooHigh
project: tools
summary: ToolsDB replication on tools-db-2 is lagging behind the primary, the current lag is 59756
16 hours ago
instance: tools-db-2
job: toolsdb-mariadb
master_host: tools-db-1.tools.eqiad1.wikimedia.cloud
team: wmcs
@cluster: wmcloud.org
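To put the lag figure in context, here is a minimal sketch that renders it as hours, minutes and seconds. It assumes the value in the alert summary is in seconds, which the alert text does not state. The function name `format_lag` is made up for this illustration.

```python
from datetime import timedelta

# Lag value copied from the alert summary above.
# ASSUMPTION: the unit is seconds; the alert text does not say so.
LAG_SECONDS = 59756


def format_lag(seconds: int) -> str:
    """Render a lag given in seconds as H:MM:SS (hypothetical helper)."""
    return str(timedelta(seconds=seconds))


print(format_lag(LAG_SECONDS))  # prints 16:35:56
```

Under that assumption the replica is roughly 16.5 hours behind, which is consistent with the alert having fired 16 hours ago and replication making no progress since.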
It seems to be stuck on a delete:
dcaro@urcuchillay$ wmcs-cookbooks wmcs.toolforge.toolsdb.get_current_replica_transaction --task-id T357264
Got matching cookbooks wmcs.toolforge.toolsdb.get_current_replica_transaction
START - Cookbook wmcs.toolforge.toolsdb.get_current_replica_transaction
Skipping not-active replica node tools-db-3.tools.eqiad1.wikimedia.cloud: NodeStatus(fqdn='tools-db-3.tools.eqiad1.wikimedia.cloud', nodeid='Unknown', replication_state=ReplicationState(status='Unknown'), host_status='Up', mariadb_status='Stopped(inactive-dead)')
###########################################################################
Replica: {replica_name}
Suspicious tables:
Table_map: `s51698__yetkin`.`visited_pages_agg` mapped
Suspicious queries:
#Q> DELETE FROM visited_pages_agg WHERE vpa_year = '2024' AND vpa_month = '2' AND vpa_day = '10'
END (PASS) - Cookbook wmcs.toolforge.toolsdb.get_current_replica_transaction (exit_code=0)