
Productionize db12[26-49]
Closed, ResolvedPublic

Description

These are new hosts:

  • db1226
  • db1227
  • db1228
  • db1229
  • db1230
  • db1231
  • db1232
  • db1233
  • db1234
  • db1235
  • db1236
  • db1237
  • db1238
  • db1239 - backup source
  • db1240 - backup source
  • db1241
  • db1242
  • db1243
  • db1244
  • db1245 - backup source
  • db1246
  • db1247
  • db1248
  • db1249
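
The task title writes these 24 hosts as the bracket range db12[26-49]. As a minimal sketch, assuming a single numeric `prefix[start-end]` range, the expansion could look like the following. The helper name `expand_host_range` is hypothetical and is not part of any Wikimedia tooling.

```python
import re
from typing import List

# Hypothetical helper for illustration only: expands one numeric bracket range.
_RANGE_RE = re.compile(
    r"(?P<prefix>[^\[\]]*)\[(?P<start>\d+)-(?P<end>\d+)\](?P<suffix>[^\[\]]*)"
)


def expand_host_range(pattern: str) -> List[str]:
    """Expand a pattern such as 'db12[26-49]' into individual host names."""
    match = _RANGE_RE.fullmatch(pattern)
    if match is None:
        raise ValueError(f"not a single bracket range: {pattern!r}")
    start, end = match["start"], match["end"]
    if int(start) > int(end):
        raise ValueError(f"range start is greater than end: {pattern!r}")
    # Keep the zero padding of the range start, e.g. [01-03] -> 01, 02, 03.
    width = len(start)
    return [
        f"{match['prefix']}{number:0{width}d}{match['suffix']}"
        for number in range(int(start), int(end) + 1)
    ]


hosts = expand_host_range("db12[26-49]")
print(len(hosts), hosts[0], hosts[-1])
```

The same call with `db11[26-49]` would list the hosts named in the decommission subtask T350458.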

Details

Repo               Branch      Lines +/-
operations/puppet  production  +0 -3
operations/puppet  production  +7 -7
operations/puppet  production  +0 -1
operations/puppet  production  +0 -1
operations/puppet  production  +29 -8
operations/puppet  production  +1 -3
operations/puppet  production  +11 -3
operations/puppet  production  +0 -1
operations/puppet  production  +0 -3
operations/puppet  production  +0 -1
operations/puppet  production  +1 -2
operations/puppet  production  +14 -4
operations/puppet  production  +1 -5
operations/puppet  production  +1 -0
operations/puppet  production  +7 -2
operations/puppet  production  +5 -2
operations/puppet  production  +5 -2
operations/puppet  production  +6 -3
operations/puppet  production  +1 -0
operations/puppet  production  +4 -2
operations/puppet  production  +1 -3
operations/puppet  production  +6 -2
operations/puppet  production  +3 -0
operations/puppet  production  +5 -2
operations/puppet  production  +6 -3
operations/puppet  production  +0 -1
operations/puppet  production  +3 -1
operations/puppet  production  +5 -3
operations/puppet  production  +0 -2
operations/puppet  production  +0 -1
operations/puppet  production  +15 -8
operations/puppet  production  +6 -6
operations/puppet  production  +1 -2
operations/puppet  production  +4 -2
operations/puppet  production  +5 -2
operations/puppet  production  +0 -1
operations/puppet  production  +6 -2
operations/puppet  production  +7 -2

Related Objects

Status    Subtype  Assigned
Resolved           Arnoldokoth
Resolved           Jhancock.wm
Resolved           Jclark-ctr
Resolved           ABran-WMF
Resolved           ABran-WMF
Resolved  Request  Jclark-ctr
Resolved  Request  Jclark-ctr
Resolved  Request  Jclark-ctr
Resolved  Request  Jclark-ctr
Resolved           Marostegui
Resolved  Request  VRiley-WMF

Event Timeline

There are a very large number of changes, so older changes are hidden.

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:24:10Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: provisionning db1232.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:25:12Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1132 in db1232 for T344036', diff saved to https://phabricator.wikimedia.org/P54376 and previous config saved to /var/cache/conftool/dbconfig/20231213-132511-arnaudb.json

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:49Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=f4afd26b-632e-4939-a88b-b73950c450b2) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1233.eqiad.wmnet - T344036

db1129.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:53Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:44:56Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=a2a99aac-f52e-4699-8466-079c9eb5ddfb) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1233.eqiad.wmnet - T344036

db1233.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:45:23Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: provisionning db1233.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:46:34Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1129 in db1233 for T344036', diff saved to https://phabricator.wikimedia.org/P54379 and previous config saved to /var/cache/conftool/dbconfig/20231213-134632-arnaudb.json

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:57:50Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=3f28995e-55ad-4c56-a0aa-10e8b8e29b74) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1248.eqiad.wmnet - T344036

db1148.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:58:10Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=9e59fa40-4904-40fa-a38c-41fbc8ed4b2a) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1248.eqiad.wmnet - T344036

db1248.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-13T13:58:25Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: provisionning db1248.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-13T14:00:21Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1148 in db1248 for T344036', diff saved to https://phabricator.wikimedia.org/P54380 and previous config saved to /var/cache/conftool/dbconfig/20231213-140017-arnaudb.json
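
The dbctl commit messages above record which old host each new host was cloned from. The numbering does not always differ by exactly 100 (db1129 was cloned into db1233), so the pairs are better read from the log than inferred. A minimal sketch, assuming the message format `Cloning <source> in <target> for <task>` seen in these entries; the function name `parse_clone_pairs` is hypothetical:

```python
import re
from typing import List, Tuple

# Matches the commit message text quoted in the SAL entries above.
CLONE_RE = re.compile(
    r"Cloning (?P<source>db\d+) in (?P<target>db\d+) for (?P<task>T\d+)"
)


def parse_clone_pairs(log_lines: List[str]) -> List[Tuple[str, str]]:
    """Return (source, target) host pairs found in dbctl commit log lines."""
    pairs = []
    for line in log_lines:
        match = CLONE_RE.search(line)
        if match is not None:
            pairs.append((match["source"], match["target"]))
    return pairs


# Commit messages copied from the timeline entries above.
sample_lines = [
    "dbctl commit (dc=all): 'Cloning db1132 in db1232 for T344036'",
    "dbctl commit (dc=all): 'Cloning db1129 in db1233 for T344036'",
    "dbctl commit (dc=all): 'Cloning db1148 in db1248 for T344036'",
]

pairs = parse_clone_pairs(sample_lines)
for source, target in pairs:
    print(f"{source} -> {target}")
```

Lines that do not contain a cloning message, such as the downtime cookbook entries, are skipped.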

Change 982197 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: toggle notifications for db1211

https://gerrit.wikimedia.org/r/982197

Change 982197 merged by Arnaudb:

[operations/puppet@production] mariadb: toggle notifications for db1211

https://gerrit.wikimedia.org/r/982197

Will have to retry productionizing db1233 on s2.

Change 982870 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: toggle notification db1226

https://gerrit.wikimedia.org/r/982870

Change 982870 merged by Arnaudb:

[operations/puppet@production] mariadb: toggle notification db1226

https://gerrit.wikimedia.org/r/982870

Change 982871 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: repooling 3 hosts

https://gerrit.wikimedia.org/r/982871

Change 982871 merged by Arnaudb:

[operations/puppet@production] mariadb: repooling 3 hosts

https://gerrit.wikimedia.org/r/982871

Change 982872 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: toggle notification for db1233

https://gerrit.wikimedia.org/r/982872

Change 982872 merged by Arnaudb:

[operations/puppet@production] mariadb: toggle notification for db1233

https://gerrit.wikimedia.org/r/982872

Change 982874 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: productionize db1234 db1249

https://gerrit.wikimedia.org/r/982874

Change 982874 merged by Arnaudb:

[operations/puppet@production] mariadb: productionize db1234 db1249

https://gerrit.wikimedia.org/r/982874

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:08:48Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=72d2fbd8-4367-446a-8ad7-e7d01a2f0e13) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1249.eqiad.wmnet - T344036

db1149.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:04Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:20Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=62363894-8bec-4a73-8630-36db2515f057) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1249.eqiad.wmnet - T344036

db1249.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:09:31Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: provisionning db1249.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:10:17Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1149 in db1249 for T344036', diff saved to https://phabricator.wikimedia.org/P54435 and previous config saved to /var/cache/conftool/dbconfig/20231214-131017-arnaudb.json

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:17:38Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=dde78a57-e309-4254-9f29-1218e94d75b2) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1234.eqiad.wmnet - T344036

db1134.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:17:53Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:18:05Z] <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=e7e36f50-6456-4234-af76-d45062614baa) set by arnaudb@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1234.eqiad.wmnet - T344036

db1234.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:18:20Z] <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: provisionning db1234.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2023-12-14T13:19:14Z] <arnaudb@cumin1001> dbctl commit (dc=all): 'Cloning db1134 in db1234 for T344036', diff saved to https://phabricator.wikimedia.org/P54436 and previous config saved to /var/cache/conftool/dbconfig/20231214-131913-arnaudb.json

Change 982883 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: toggle notifications for db1249

https://gerrit.wikimedia.org/r/982883

Change 982883 merged by Arnaudb:

[operations/puppet@production] mariadb: toggle notifications for db1249 and db1234

https://gerrit.wikimedia.org/r/982883

ABran-WMF changed the task status from Open to In Progress. Dec 20 2023, 3:43 PM
ABran-WMF changed the status of subtask T350458: Decommission db11[26-49] from Open to In Progress.

Change 994769 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: migrate core multi-instances nodes

https://gerrit.wikimedia.org/r/994769

Change 994769 merged by Arnaudb:

[operations/puppet@production] mariadb: migrate core multi-instances nodes

https://gerrit.wikimedia.org/r/994769

FYI: I've recovered db1239 (s1, s2), db1240 (s1) and db1245 (s4, s5) from backups.

Still pending: s3 on db1240 and the finishing touches (prometheus reset, zarcillo, etc.), but most instances are already replicating. Also not done yet: migrating the backup config.

Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:52:47Z] <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=a01327d9-e1b0-411e-a2cc-06fb04f6e23b) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1235.eqiad.wmnet - T344036

db1135.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:13Z] <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:16Z] <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036

Icinga downtime and Alertmanager silence (ID=ce0e6ddb-012f-4801-80f4-a478b0c13af7) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db1235.eqiad.wmnet - T344036

db1235.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:53:41Z] <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: provisionning db1235.eqiad.wmnet - T344036

Mentioned in SAL (#wikimedia-operations) [2024-02-05T12:54:45Z] <arnaudb@cumin1002> dbctl commit (dc=all): 'Cloning db1135 in db1235 for T344036', diff saved to https://phabricator.wikimedia.org/P56211 and previous config saved to /var/cache/conftool/dbconfig/20240205-125444-arnaudb.json

Change 997503 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: add db1235 to production

https://gerrit.wikimedia.org/r/997503

Change 997504 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: toggle notifications for db1235

https://gerrit.wikimedia.org/r/997504

Change 997503 abandoned by Arnaudb:

[operations/puppet@production] mariadb: add db1235 to production

Reason:

https://gerrit.wikimedia.org/r/997503

Change 997504 merged by Arnaudb:

[operations/puppet@production] mariadb: toggle notifications for db1235

https://gerrit.wikimedia.org/r/997504

Change 998471 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups/mediabackups: Migrate old backup source config to new hosts

https://gerrit.wikimedia.org/r/998471

Change 998471 merged by Jcrespo:

[operations/puppet@production] dbbackups/mediabackups: Migrate old backup source config to new hosts

https://gerrit.wikimedia.org/r/998471

The new servers are in use; the old backup sources are idle (no service uses them) but still replicating and available. All work on my side is done, unless for some reason the backups on the new hosts start to fail.

@ABran-WMF what's pending? All hosts done and pooled? Good to close?

Change 1003050 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mariadb: Reenable notifications on backup sources

https://gerrit.wikimedia.org/r/1003050

Change 1003050 merged by Jcrespo:

[operations/puppet@production] mariadb: Reenable notifications on backup sources

https://gerrit.wikimedia.org/r/1003050