These are new hosts:
- db1226
- db1227
- db1228
- db1229
- db1230
- db1231
- db1232
- db1233
- db1234
- db1235
- db1236
- db1237
- db1238
- db1239 - backup source
- db1240 - backup source
- db1241
- db1242
- db1243
- db1244
- db1245 - backup source
- db1246
- db1247
- db1248
- db1249
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| | | | Restricted Task |
| | | | Restricted Task |
| Resolved | | Arnoldokoth | T353651 VRTS Upgrade to 6.5.3 |
| | | | Unknown Object (Task) |
| Resolved | | Jhancock.wm | T342166 Q1:rack/setup/install db12[34-49] |
| | | | Unknown Object (Task) |
| Resolved | | Jclark-ctr | T342176 Q1:rack/setup/install db12[26-33] |
| Resolved | | ABran-WMF | T344036 Productionize db12[26-49] |
| Resolved | | ABran-WMF | T350458 Decommission db11[26-49] |
| Resolved | Request | Jclark-ctr | T351065 decommission db1136.eqiad.wmnet |
| Resolved | Request | Jclark-ctr | T351063 decommission db1127.eqiad.wmnet |
| Resolved | Request | Jclark-ctr | T351067 decommission db1130.eqiad.wmnet |
| Resolved | Request | Jclark-ctr | T352362 decommission db1126.eqiad.wmnet |
| Resolved | | Marostegui | T355541 Install a temporary DB host in m2 to support VRTS migration |
| Resolved | Request | VRiley-WMF | T355740 decommission db1134.eqiad.wmnet |
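
The task titles above use a bracket shorthand such as `db12[26-49]` for the same hosts that are listed one by one at the top. As a rough illustration of how that shorthand maps to individual host names, here is a minimal sketch. The helper `expand_host_range` is hypothetical, written only for this note, and is not part of any existing tooling; it handles a single `[start-end]` group and nothing more.

```python
import re

# Hypothetical helper (an assumption for illustration, not an existing tool):
# expands one "[start-end]" group, e.g. "db12[26-49]" -> db1226 ... db1249.
_RANGE = re.compile(
    r"(?P<prefix>[^\[\]]*)\[(?P<start>\d+)-(?P<end>\d+)\](?P<suffix>[^\[\]]*)"
)


def expand_host_range(pattern: str) -> list[str]:
    """Return the host names covered by a single bracketed numeric range."""
    match = _RANGE.fullmatch(pattern)
    if match is None:
        # No bracket group: treat the input as a single plain host name.
        return [pattern]
    start, end = match["start"], match["end"]
    width = len(start)  # keep zero padding if the range uses it
    return [
        f"{match['prefix']}{number:0{width}d}{match['suffix']}"
        for number in range(int(start), int(end) + 1)
    ]


hosts = expand_host_range("db12[26-49]")
print(len(hosts), hosts[0], hosts[-1])
```

Under these assumptions, `db12[26-49]` expands to the 24 names from db1226 through db1249, which matches the list at the top of this task.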
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:24:10Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: provisionning db1232.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:25:12Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1132 in db1232 for T344036', diff saved to https://phabricator.wikimedia.org/P54376 and previous config saved to /var/cache/conftool/dbconfig/20231213-132511-arnaudb.json
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:49Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=f4afd26b-632e-4939-a88b-b73950c450b2) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1233.eqiad.wmnet - T344036
db1129.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:53Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:56Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=a2a99aac-f52e-4699-8466-079c9eb5ddfb) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1233.eqiad.wmnet - T344036
db1233.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:45:23Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:46:34Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1129 in db1233 for T344036', diff saved to https://phabricator.wikimedia.org/P54379 and previous config saved to /var/cache/conftool/dbconfig/20231213-134632-arnaudb.json
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:57:50Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=3f28995e-55ad-4c56-a0aa-10e8b8e29b74) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1248.eqiad.wmnet - T344036
db1148.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:58:10Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=9e59fa40-4904-40fa-a38c-41fbc8ed4b2a) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1248.eqiad.wmnet - T344036
db1248.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:58:25Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-13T14:00:21Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1148 in db1248 for T344036', diff saved to https://phabricator.wikimedia.org/P54380 and previous config saved to /var/cache/conftool/dbconfig/20231213-140017-arnaudb.json
Change 982197 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: toggle notifications for db1211
Change 982197 merged by Arnaudb:
[operations/puppet@production] mariadb: toggle notifications for db1211
Change 982870 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: toggle notification db1226
Change 982870 merged by Arnaudb:
[operations/puppet@production] mariadb: toggle notification db1226
Change 982871 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: repooling 3 hosts
Change 982871 merged by Arnaudb:
[operations/puppet@production] mariadb: repooling 3 hosts
Change 982872 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: toggle notification for db1233
Change 982872 merged by Arnaudb:
[operations/puppet@production] mariadb: toggle notification for db1233
Change 982874 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: productionize db1234 db1249
Change 982874 merged by Arnaudb:
[operations/puppet@production] mariadb: productionize db1234 db1249
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:08:48Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=72d2fbd8-4367-446a-8ad7-e7d01a2f0e13) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1249.eqiad.wmnet - T344036
db1149.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:04Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:20Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=62363894-8bec-4a73-8630-36db2515f057) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1249.eqiad.wmnet - T344036
db1249.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:31Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:10:17Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1149 in db1249 for T344036', diff saved to https://phabricator.wikimedia.org/P54435 and previous config saved to /var/cache/conftool/dbconfig/20231214-131017-arnaudb.json
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:17:38Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=dde78a57-e309-4254-9f29-1218e94d75b2) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1234.eqiad.wmnet - T344036
db1134.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:17:53Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:18:05Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=e7e36f50-6456-4234-af76-d45062614baa) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1234.eqiad.wmnet - T344036
db1234.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:18:20Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:19:14Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1134 in db1234 for T344036', diff saved to https://phabricator.wikimedia.org/P54436 and previous config saved to /var/cache/conftool/dbconfig/20231214-131913-arnaudb.json
Change 982883 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: toggle notifications for db1249
Change 982883 merged by Arnaudb:
[operations/puppet@production] mariadb: toggle notifications for db1249 and db1234
Change 994769 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: migrate core multi-instances nodes
Change 994769 merged by Arnaudb:
[operations/puppet@production] mariadb: migrate core multi-instances nodes
FYI: I've recovered db1239 (s1, s2), db1240 (s1) and db1245 (s4, s5) from backups.
Still pending: s3 on db1240 and the finishing touches (prometheus reset, zarcillo, etc.), but most of the instances are already replicating. Also not done yet: migrating the backup config.
Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:52:47Z] <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=a01327d9-e1b0-411e-a2cc-06fb04f6e23b) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1235.eqiad.wmnet - T344036
db1135.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:13Z] <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:16Z] <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036
Icinga downtime and Alertmanager silence (ID=ce0e6ddb-012f-4801-80f4-a478b0c13af7) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1235.eqiad.wmnet - T344036
db1235.eqiad.wmnet
Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:41Z] <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036
Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:54:45Z] <arnaudb@cumin1002> dbctl commit (dc=all): 'Cloning db1135 in db1235 for T344036', diff saved to https://phabricator.wikimedia.org/P56211 and previous config saved to /var/cache/conftool/dbconfig/20240205-125444-arnaudb.json
Change 997503 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: add db1235 to production
Change 997504 had a related patch set uploaded (by Arnaudb; author: Arnaudb):
[operations/puppet@production] mariadb: toggle notifications for db1235
Change 997503 abandoned by Arnaudb:
[operations/puppet@production] mariadb: add db1235 to production
Reason:
Change 997504 merged by Arnaudb:
[operations/puppet@production] mariadb: toggle notifications for db1235
Change 998471 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] dbbackups/mediabackups: Migrate old backup source config to new hosts
Change 998471 merged by Jcrespo:
[operations/puppet@production] dbbackups/mediabackups: Migrate old backup source config to new hosts
The new servers are in use. The old backup sources are idle (no service uses them) but are still replicating and available. All work on my side is done, unless for some reason the backups on the new hosts start to fail.
Change 1003050 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] mariadb: Reenable notifications on backup sources
Change 1003050 merged by Jcrespo:
[operations/puppet@production] mariadb: Reenable notifications on backup sources