Backup sources
- db1150
- db1171
- db1216
- db1225
- db1239
- db1240
- db1245
- db2139 - to be decommissioned
- db2141
- db2197
- db2198
- db2199
- db2200
- db2201
- db2239
backup1 hosts
- db1204
- db1205
- db2183
- db2184
Change #1112184 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] installserver: Review backup and db hosts
Change #1112184 merged by Jcrespo:
[operations/puppet@production] installserver: Review backup and db hosts
Icinga downtime and Alertmanager silence (ID=2ec27167-237c-4fd7-9ccb-4486e0a3234c) set by jynus@cumin2002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: reimage
db2141.codfw.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by jynus@cumin2002 for host db2141.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jynus@cumin2002 for host db2141.codfw.wmnet with OS bookworm completed:
Icinga status is not optimal, downtime not removed
That is because I am rebuilding the tables, so replication is currently stopped.
Icinga downtime and Alertmanager silence (ID=d7245c24-f67b-4f43-ae17-b2ef80f610ed) set by jynus@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: os upgrade
db1245.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1245.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1245.eqiad.wmnet with OS bookworm completed:
Change #1112802 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups: Remove set user permissions from m1 backup user grants
Icinga downtime and Alertmanager silence (ID=f63b7dc3-cd57-40da-aafb-d98d09fe8ad8) set by jynus@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: os upgrade
db1240.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1240.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1240.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=8324c4ec-57d8-4460-ab4e-364dd23824da) set by jynus@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: os upgrade
db1205.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1205.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1205.eqiad.wmnet with OS bookworm completed:
Mentioned in SAL (#wikimedia-operations) [2025-01-23T10:57:52Z] <jynus> pausing media backups on eqiad for maintenance T383902
Icinga downtime and Alertmanager silence (ID=8968cdca-1368-4a8b-8d7b-88380f4a6dfe) set by jynus@cumin1002 for 4:00:00 on 1 host(s) and their services with reason: os upgrade
db1204.eqiad.wmnet
Icinga downtime and Alertmanager silence (ID=8e426345-55db-4d8d-97b1-479f633bd115) set by jynus@cumin1002 for 4:00:00 on 1 host(s) and their services with reason: os upgrade
db1205.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1204.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1204.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=6f470676-dea0-4f69-80aa-826d9c313d20) set by jynus@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: reimage
db1239.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1239.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1239.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=9b068a8d-a257-44ae-b3a9-aa92c844556a) set by jynus@cumin1002 for 6:00:00 on 1 host(s) and their services with reason: os upgrade
db1225.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1225.eqiad.wmnet with OS bookworm
Icinga downtime and Alertmanager silence (ID=fef62d94-be78-4342-b8f7-11aec550c58d) set by jynus@cumin1002 for 4:00:00 on 1 host(s) and their services with reason: os upgrade
db1216.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1216.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1225.eqiad.wmnet with OS bookworm completed:
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1216.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=850d9949-25a8-4e40-abfc-8a9183ee0279) set by jynus@cumin1002 for 3 days, 0:00:00 on 1 host(s) and their services with reason: rebuilding tables
db1216.eqiad.wmnet
Icinga downtime and Alertmanager silence (ID=da5694f7-58b7-4217-8b3e-f5bde76d7490) set by jynus@cumin1002 for 3 days, 0:00:00 on 1 host(s) and their services with reason: reimage
db1171.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1171.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1171.eqiad.wmnet with OS bookworm executed with errors:
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1171.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1171.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=cd3eaa16-18c2-4399-afd5-a1186d100dc4) set by jynus@cumin1002 for 3 days, 0:00:00 on 1 host(s) and their services with reason: reimage
db1150.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage was started by root@cumin1002 for host db1150.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 for host db1150.eqiad.wmnet with OS bookworm completed:
Icinga downtime and Alertmanager silence (ID=ac63cc5e-30f2-4726-b368-8ac5c3ba5641) set by jynus@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: test new s4 backups
db2201.codfw.wmnet
Icinga downtime and Alertmanager silence (ID=0f2ee58b-8e61-458b-8036-54ff80c6dc78) set by jynus@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: prepare for decom
db2202.codfw.wmnet
Change #1115774 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups: Decommission db2139
Change #1115774 merged by Jcrespo:
[operations/puppet@production] dbbackups: Decommission db2139
This is technically done: all hosts have been upgraded. I still need to review the grants, ideally removing the unneeded ones and unifying the rest across all hosts.
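To cross-check the "all hosts upgraded" status against the cookbook messages in this task, something like the following could be used. It is only a rough sketch: the regular expression assumes the bot's message format stays exactly as it appears in this log, and the function names `reimage_status` and `pending` are made up for illustration. A host with no completion entry here may have been upgraded earlier or tracked elsewhere, so an entry in the pending list is a prompt to check, not proof that work is missing.

```python
import re

# Matches the reimage result lines as they appear in this task's log.
# Assumption: the bot's wording does not change.
RESULT = re.compile(
    r"Cookbook cookbooks\.sre\.hosts\.reimage started by \S+ "
    r"for host (\S+) with OS (\S+) (completed|executed with errors):"
)


def reimage_status(log_lines):
    """Return {short_hostname: last_outcome} from reimage result lines."""
    status = {}
    for line in log_lines:
        match = RESULT.search(line)
        if match:
            fqdn, _os_name, outcome = match.groups()
            # Later lines overwrite earlier ones, so a retry that
            # succeeds replaces a previous "executed with errors".
            status[fqdn.split(".")[0]] = outcome
    return status


def pending(hosts, status):
    """Hosts from the checklist without a 'completed' result in the log."""
    return [host for host in hosts if status.get(host) != "completed"]


sample = [
    "Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 "
    "for host db1171.eqiad.wmnet with OS bookworm executed with errors:",
    "Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 "
    "for host db1171.eqiad.wmnet with OS bookworm completed:",
    "Cookbook cookbooks.sre.hosts.reimage started by root@cumin1002 "
    "for host db1150.eqiad.wmnet with OS bookworm completed:",
]

status = reimage_status(sample)
print(pending(["db1150", "db1171", "db1216"], status))
```

With the three sample lines, db1171 counts as done because its retry completed, while db1216 is reported as pending only because the sample contains no result line for it.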
Change #1112802 merged by Jcrespo:
[operations/puppet@production] dbbackups: Fix dump grants for backup sources and m1
Change #1116845 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups: Update grants for misc hosts other than m1
Change #1116846 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups: Remove last references to dbprov[12]00[12]
Mentioned in SAL (#wikimedia-operations) [2025-02-04T11:48:11Z] <jynus> deploying new backup grants for matomo and analytics_meta T383902
Mentioned in SAL (#wikimedia-operations) [2025-02-04T12:38:18Z] <jynus> deploying new backup grants for ES hosts T383902
Change #1117182 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups: Fix m5 backup grant issues
Change #1116845 merged by Jcrespo:
[operations/puppet@production] dbbackups: Update grants for misc hosts other than m1
Change #1117182 merged by Jcrespo:
[operations/puppet@production] dbbackups: Fix m5 backup grant issues
Change #1116846 merged by Jcrespo:
[operations/puppet@production] dbbackups: Remove last references to dbprov[12]00[12]