
Move wikireplicas under the new sanitarium hosts (db1154, db1155)
Closed, Resolved, Public

Description

Once the new sanitarium hosts running Buster and MariaDB 10.4 + Stretch are ready, we need to start moving the wikireplicas under them

All of these run 10.4, so they can be moved without any blockers:

  • clouddb1013:3311
  • clouddb1013:3313
  • clouddb1014:3312
  • clouddb1014:3317
  • clouddb1015:3314
  • clouddb1015:3316
  • clouddb1016:3315
  • clouddb1016:3318
  • clouddb1017:3311
  • clouddb1017:3313
  • clouddb1018:3312
  • clouddb1018:3317
  • clouddb1019:3314
  • clouddb1019:3316
  • clouddb1020:3315
  • clouddb1020:3318
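
For planning the moves it can help to see the list above grouped by section. The following is a minimal sketch, not something taken from this task: it assumes the convention that port 331N serves section sN (consistent with the s1 to s8 sections and ports 3311 to 3318 listed here, but not stated explicitly in the task). The function names are hypothetical.

```python
from collections import defaultdict

# Instances copied from the task description above.
INSTANCES = [
    "clouddb1013:3311", "clouddb1013:3313",
    "clouddb1014:3312", "clouddb1014:3317",
    "clouddb1015:3314", "clouddb1015:3316",
    "clouddb1016:3315", "clouddb1016:3318",
    "clouddb1017:3311", "clouddb1017:3313",
    "clouddb1018:3312", "clouddb1018:3317",
    "clouddb1019:3314", "clouddb1019:3316",
    "clouddb1020:3315", "clouddb1020:3318",
]


def section_for_port(port: int) -> str:
    """ASSUMPTION: port 331N maps to section sN (3311 -> s1, ..., 3318 -> s8)."""
    if not 3311 <= port <= 3318:
        raise ValueError(f"port {port} is outside the assumed 3311-3318 range")
    return f"s{port - 3310}"


def group_by_section(instances):
    """Return {section: sorted list of hosts} for 'host:port' strings."""
    grouped = defaultdict(list)
    for instance in instances:
        host, port = instance.split(":")
        grouped[section_for_port(int(port))].append(host)
    return {section: sorted(hosts) for section, hosts in sorted(grouped.items())}


if __name__ == "__main__":
    for section, hosts in group_by_section(INSTANCES).items():
        print(section, ", ".join(hosts))
```

Under that assumption, every section ends up with exactly two clouddb hosts, which matches the pairs reported as moved in the timeline (for example clouddb1016:3315 and clouddb1020:3315).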

The following replicas run 10.1 and cannot be moved until we get the green light from cloud-services-team, as replication might break at any time:

  • labsdb1009
  • labsdb1010
  • labsdb1011

Sections to move on labsdb1009, labsdb1010, labsdb1011:

  • s1
  • s2
  • s3
  • s4
  • s5
  • s6
  • s7
  • s8

labsdb1012 belongs to Analytics, and it is probably better to just rebuild it as multi-instance+10.4+stretch rather than moving it under the new replicas (T269211: Convert labsdb1012 from multi-source to multi-instance).

Update: labsdb1012 no longer exists; it has been converted to clouddb1021.

Event Timeline

Marostegui moved this task from Ready to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2021-01-18T08:17:40Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1106 to stop replication, place db1105:3311 temporarily in vslow T272008', diff saved to https://phabricator.wikimedia.org/P13787 and previous config saved to /var/cache/conftool/dbconfig/20210118-081740-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-01-18T09:25:46Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1074 to stop replication T272008', diff saved to https://phabricator.wikimedia.org/P13795 and previous config saved to /var/cache/conftool/dbconfig/20210118-092546-marostegui.json

I have moved 4 of the 16 new instances under the new hosts. I won't move more today, to make sure everything runs fine.

Mentioned in SAL (#wikimedia-operations) [2021-01-19T06:57:49Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1082 to stop replication T272008', diff saved to https://phabricator.wikimedia.org/P13821 and previous config saved to /var/cache/conftool/dbconfig/20210119-065748-marostegui.json

clouddb1016:3315 and clouddb1020:3315 moved

Mentioned in SAL (#wikimedia-operations) [2021-01-19T08:58:57Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1078 to stop replication T272008', diff saved to https://phabricator.wikimedia.org/P13826 and previous config saved to /var/cache/conftool/dbconfig/20210119-085856-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-01-19T09:01:00Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1112 to stop replication T272008', diff saved to https://phabricator.wikimedia.org/P13828 and previous config saved to /var/cache/conftool/dbconfig/20210119-090100-marostegui.json

clouddb1013:3313 and clouddb1017:3313 moved

Mentioned in SAL (#wikimedia-operations) [2021-01-20T10:34:50Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1079 to stop replication T272008', diff saved to https://phabricator.wikimedia.org/P13842 and previous config saved to /var/cache/conftool/dbconfig/20210120-103449-marostegui.json

clouddb1014:3317 and clouddb1018:3317 moved.

clouddb1016:3318 and clouddb1020:3318 moved.

clouddb1015:3316 moved - clouddb1019:3316 is down due to HW issues: T272125

Marostegui changed the task status from Open to Stalled. (Jan 22 2021, 1:33 PM)
Marostegui updated the task description.

clouddb1015:3314 moved.
The only pending host is clouddb1019, which is inaccessible and waiting for on-site maintenance (T272125).

Mentioned in SAL (#wikimedia-operations) [2021-01-27T07:05:03Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1085 T272008', diff saved to https://phabricator.wikimedia.org/P13968 and previous config saved to /var/cache/conftool/dbconfig/20210127-070502-marostegui.json

clouddb1019:3316 moved under db1155:3316

Marostegui moved this task from In progress to Blocked on the DBA board.
Marostegui added subscribers: nskaggs, Bstorm.

clouddb1019:3314 moved under db1155:3314
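
The task does not record the exact commands used for each move. As a rough illustration only, repointing a replica at a new master in MariaDB is typically done with `STOP SLAVE`, `CHANGE MASTER TO`, and `START SLAVE`. The sketch below builds those statements. It assumes classic binlog file/position coordinates rather than GTID, and assumes the old and new sanitarium masters were stopped at an equivalent position first (which is what the "Depool ... to stop replication" entries suggest). The binlog file name and position are placeholders, and `repoint_statements` is a hypothetical helper.

```python
def repoint_statements(master_host: str, master_port: int,
                       log_file: str, log_pos: int) -> list[str]:
    """Build the MariaDB statements to repoint a replica at a new master.

    ASSUMPTION: binlog file/position replication, not GTID. The caller is
    responsible for supplying coordinates that are valid on the new master.
    """
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    return [
        "STOP SLAVE;",
        (
            f"CHANGE MASTER TO MASTER_HOST='{master_host}', "
            f"MASTER_PORT={master_port}, "
            f"MASTER_LOG_FILE='{log_file}', "
            f"MASTER_LOG_POS={log_pos};"
        ),
        "START SLAVE;",
    ]


if __name__ == "__main__":
    # Placeholder coordinates; the real values are not recorded in this task.
    for statement in repoint_statements("db1155", 3316, "PLACEHOLDER-bin.000001", 4):
        print(statement)
```

The statements would be run on the replica being moved (for example clouddb1019:3316), not on db1155 itself.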

All the new clouddb hosts have been moved under the new 10.4 sanitariums. This task is now stalled, waiting on the green light to move the labsdb* hosts under the new sanitariums once we can afford replication breaking at any time when replicating from 10.4 to 10.1.

This can happen after 15th April

Marostegui changed the task status from Stalled to Open. (Apr 16 2021, 5:29 AM)
Marostegui moved this task from Blocked to Ready on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2021-04-19T05:41:58Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1106 T272008', diff saved to https://phabricator.wikimedia.org/P15406 and previous config saved to /var/cache/conftool/dbconfig/20210419-054158-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T05:42:15Z] <marostegui> Stop sanitarium master on s1 (lag will show up on clouddb* labsdb* hosts) T272008

Mentioned in SAL (#wikimedia-operations) [2021-04-19T05:52:40Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1074 T272008', diff saved to https://phabricator.wikimedia.org/P15408 and previous config saved to /var/cache/conftool/dbconfig/20210419-055240-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T05:53:11Z] <marostegui> Stop sanitarium master on s2 (lag will show up on clouddb* labsdb* hosts) T272008

Mentioned in SAL (#wikimedia-operations) [2021-04-19T06:46:01Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1112 T272008', diff saved to https://phabricator.wikimedia.org/P15414 and previous config saved to /var/cache/conftool/dbconfig/20210419-064600-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T07:00:35Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1121 T272008', diff saved to https://phabricator.wikimedia.org/P15418 and previous config saved to /var/cache/conftool/dbconfig/20210419-070035-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T07:17:02Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1082 T272008', diff saved to https://phabricator.wikimedia.org/P15422 and previous config saved to /var/cache/conftool/dbconfig/20210419-071701-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T07:41:56Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1085 T272008', diff saved to https://phabricator.wikimedia.org/P15430 and previous config saved to /var/cache/conftool/dbconfig/20210419-074155-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T08:26:00Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1079 T272008', diff saved to https://phabricator.wikimedia.org/P15442 and previous config saved to /var/cache/conftool/dbconfig/20210419-082559-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T08:45:24Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1087 T272008', diff saved to https://phabricator.wikimedia.org/P15446 and previous config saved to /var/cache/conftool/dbconfig/20210419-084523-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-19T08:48:38Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1087 T272008', diff saved to https://phabricator.wikimedia.org/P15448 and previous config saved to /var/cache/conftool/dbconfig/20210419-084834-marostegui.json

labsdb* hosts are all now running under 10.4 sanitariums.
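
The dbctl entries in the timeline above follow a regular shape, so a summary of which hosts were depooled and when can be extracted mechanically. This is a minimal sketch based only on the line format visible in this task; the pattern and the `parse_sal_entry` name are assumptions and may not cover other SAL message variants.

```python
import re
from datetime import datetime, timezone

# Matches lines such as:
# ... [2021-04-19T08:48:38Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1087 T272008', ...
SAL_PATTERN = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\]\s+"
    r"<(?P<who>[^>]+)>\s+dbctl commit \(dc=all\): "
    r"'(?P<action>Depool|Repool) (?P<host>db\d+)"
)


def parse_sal_entry(line: str):
    """Return a dict for a dbctl Depool/Repool entry, or None if the line does not match."""
    match = SAL_PATTERN.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return {
        "timestamp": timestamp,
        "who": match["who"],
        "action": match["action"],
        "host": match["host"],
    }


if __name__ == "__main__":
    sample = (
        "Mentioned in SAL (#wikimedia-operations) [2021-04-19T08:48:38Z] "
        "<marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1087 T272008', "
        "diff saved to https://phabricator.wikimedia.org/P15448"
    )
    print(parse_sal_entry(sample))
```

Entries that are not dbctl commits, such as the "Stop sanitarium master on s1" lines, simply return `None`.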