
Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers)
Closed, Resolved, Public

Description

The following hosts are scheduled to be decommissioned in Q2 and need to be refreshed:
db1074-db1095 (22 servers)
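As a quick sanity check on the host count, the range notation can be expanded programmatically. This is a minimal sketch; the helper name `expand_host_range` is hypothetical and not part of any Wikimedia tooling.

```python
import re


def expand_host_range(spec: str) -> list[str]:
    """Expand a range such as 'db1074-db1095' into individual host names."""
    match = re.fullmatch(r"([a-z]+)(\d+)-\1(\d+)", spec)
    if match is None:
        raise ValueError(f"not a host range: {spec!r}")
    prefix = match.group(1)
    first, last = int(match.group(2)), int(match.group(3))
    if last < first:
        raise ValueError(f"range is reversed: {spec!r}")
    # Both ends of the range are inclusive.
    return [f"{prefix}{number}" for number in range(first, last + 1)]


old_hosts = expand_host_range("db1074-db1095")
print(len(old_hosts), old_hosts[0], old_hosts[-1])
```

The same helper applied to `db1155-db1175` gives the list of new hosts named in the task title.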

We have started to see issues on them, especially BBU-related ones: T258360

Replacement plan:

  • db1124 B1 (old sanitarium host, currently in use) to be placed in s7
  • db1125 D1 (old sanitarium host, currently in use) to replace db1077 in testing-s4
  • db1156 A1 to replace db1074 (sanitarium master)
  • db1157 A5 to replace db1075
  • db1158 A5 to replace db1079 (sanitarium master)
  • db1159 A6 to replace db1080 (m1 master)
  • db1160 A6 to replace db1081
  • db1161 A8 to replace db1082 (sanitarium master)
  • db1162 B1 to replace db1076 (candidate master)
  • db1163 B3 to replace db1083 (s1 master) CURRENTLY pooled on s1 as stretch to substitute db1134 T274472
  • db1164 B5 to replace db1084
  • db1165 B6 to replace db1085 (sanitarium master) T258361#6923913
  • db1166 C3 to replace db1078
  • db1167 C3 to replace db1087 (sanitarium master)
  • db1168 C5 to replace db1088
  • db1169 C5 to replace db1089
  • db1170 C6 to replace db1090 (multi-instance)
  • db1171 C6 to replace db1095
  • db1172 D1 to replace db1092
  • db1173 D3 to replace db1093 (candidate master)
  • db1174 D6 to replace db1094
  • db1175 D3 to be placed on s3
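The replacement plan above can be turned into structured data, for example to cross-check which old hosts already have a replacement. This is only a sketch: the `Replacement` type, the regular expression and `parse_plan_line` are hypothetical helpers written for the line format used in this description, and lines that do not say "to replace" (such as db1124 and db1175) are skipped.

```python
import re
from typing import NamedTuple, Optional


class Replacement(NamedTuple):
    new_host: str
    rack: str
    old_host: str
    role: Optional[str]  # e.g. "sanitarium master"; None when not stated


# Matches "<new host> <rack> ... to replace <old host> (<optional role>)".
PLAN_RE = re.compile(
    r"(?P<new>db\d+) (?P<rack>[A-D]\d)\b.*?\bto replace (?P<old>db\d+)"
    r"(?: \((?P<role>[^)]*)\))?"
)


def parse_plan_line(line: str) -> Optional[Replacement]:
    """Return the replacement described by one plan line, or None."""
    match = PLAN_RE.search(line)
    if match is None:
        return None
    return Replacement(match["new"], match["rack"], match["old"], match["role"])


sample = [
    "db1156 A1 to replace db1074 (sanitarium master)",
    "db1157 A5 to replace db1075",
    "db1175 D3 to be placed on s3",
]
for entry in map(parse_plan_line, sample):
    print(entry)
```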

Decommissioning progress

Event Timeline

Change 682355 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1158

https://gerrit.wikimedia.org/r/682355

Change 682355 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1158

https://gerrit.wikimedia.org/r/682355

Completed auto-reimage of hosts:

['db1124.eqiad.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2021-04-26T05:47:01Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1158 to dbctl, depooled, T258361', diff saved to https://phabricator.wikimedia.org/P15521 and previous config saved to /var/cache/conftool/dbconfig/20210426-054700-marostegui.json

Checking tables on db1124 after the transfer from db1158.

Change 682495 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] install_server: Do not format db1124

https://gerrit.wikimedia.org/r/682495

Change 682495 merged by Marostegui:

[operations/puppet@production] install_server: Do not format db1124

https://gerrit.wikimedia.org/r/682495

Change 682570 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Move db1125 from sanitarium to testing

https://gerrit.wikimedia.org/r/682570

Change 682570 merged by Marostegui:

[operations/puppet@production] mariadb: Move db1125 from sanitarium to testing

https://gerrit.wikimedia.org/r/682570

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1125.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202104261017_marostegui_29781.log.

Change 682571 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] check_private_data_report: Remove db1125

https://gerrit.wikimedia.org/r/682571

Change 682571 merged by Marostegui:

[operations/puppet@production] check_private_data_report: Remove db1125

https://gerrit.wikimedia.org/r/682571

Completed auto-reimage of hosts:

['db1125.eqiad.wmnet']

and were ALL successful.

Change 682794 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1124: Enable notifications

https://gerrit.wikimedia.org/r/682794

Change 682794 merged by Marostegui:

[operations/puppet@production] db1124: Enable notifications

https://gerrit.wikimedia.org/r/682794

Change 682795 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1124 to dbctl

https://gerrit.wikimedia.org/r/682795

Change 682795 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1124 to dbctl

https://gerrit.wikimedia.org/r/682795

Mentioned in SAL (#wikimedia-operations) [2021-04-27T04:45:20Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 to dbctl, depooled, T258361', diff saved to https://phabricator.wikimedia.org/P15540 and previous config saved to /var/cache/conftool/dbconfig/20210427-044520-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-27T04:46:10Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 with minimal weight for the first time in s7 T258361', diff saved to https://phabricator.wikimedia.org/P15541 and previous config saved to /var/cache/conftool/dbconfig/20210427-044609-marostegui.json

Pooled db1124 with minimal weight for the first time in s7

Mentioned in SAL (#wikimedia-operations) [2021-04-27T05:08:27Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 with minimal weight for the first time in s7 T258361', diff saved to https://phabricator.wikimedia.org/P15545 and previous config saved to /var/cache/conftool/dbconfig/20210427-050826-marostegui.json

I am automatically pooling db1124 into s7.

Mentioned in SAL (#wikimedia-operations) [2021-04-27T05:21:25Z] <marostegui> Stop mysql on db1087 to clone db1167 (lag will appear on wikidata on wikireplicas) T258361

Change 682885 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Productionize db1167

https://gerrit.wikimedia.org/r/682885

Change 682885 merged by Marostegui:

[operations/puppet@production] mariadb: Productionize db1167

https://gerrit.wikimedia.org/r/682885

Change 683121 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1167: Enable notifications

https://gerrit.wikimedia.org/r/683121

Change 683121 merged by Marostegui:

[operations/puppet@production] db1167: Enable notifications

https://gerrit.wikimedia.org/r/683121

Change 683124 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1167 to dbctl

https://gerrit.wikimedia.org/r/683124

Change 683124 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1167 to dbctl

https://gerrit.wikimedia.org/r/683124

Mentioned in SAL (#wikimedia-operations) [2021-04-28T05:51:45Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1167 in s8 T258361', diff saved to https://phabricator.wikimedia.org/P15605 and previous config saved to /var/cache/conftool/dbconfig/20210428-055144-marostegui.json

Change 683474 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1156 to dbctl

https://gerrit.wikimedia.org/r/683474

Change 683474 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1156 to dbctl

https://gerrit.wikimedia.org/r/683474

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:38:13Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1156 to dbctl T258361', diff saved to https://phabricator.wikimedia.org/P15624 and previous config saved to /var/cache/conftool/dbconfig/20210429-043812-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:38:57Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15625 and previous config saved to /var/cache/conftool/dbconfig/20210429-043857-marostegui.json

db1156 pooled in s2 with minimal weight

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:44:58Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15626 and previous config saved to /var/cache/conftool/dbconfig/20210429-044458-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:50:15Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15627 and previous config saved to /var/cache/conftool/dbconfig/20210429-045015-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:55:57Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15629 and previous config saved to /var/cache/conftool/dbconfig/20210429-045557-marostegui.json

Automatically pooling db1156 into s2.
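The sequence of log entries above (a first pool with minimal weight, then further commits a few minutes apart) suggests a gradual ramp-up. The sketch below only builds the command strings for such a ramp and does not execute anything. The percentages, the commit message wording and the function name `pooling_steps` are assumptions for illustration; the `dbctl instance ... pool -p` and `dbctl config commit -m` forms are written from memory of the tool and should be checked against `dbctl --help` before use.

```python
import shlex


def pooling_steps(host, section, task, percentages=(10, 25, 75, 100)):
    """Build dbctl command strings for pooling a host in increasing steps.

    The default percentages are hypothetical; the task does not state
    which values were used.
    """
    steps = []
    for pct in percentages:
        message = f"Pool {host} in {section} at {pct}% {task}"
        steps.append(f"dbctl instance {host} pool -p {pct}")
        steps.append(f"dbctl config commit -m {shlex.quote(message)}")
    return steps


for command in pooling_steps("db1156", "s2", "T258361"):
    print(command)
```

In practice each step would be followed by a pause to watch for errors and replication lag before moving to the next percentage.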

All the hosts in this task have been productionized. Pending: decommission the old ones.

Marostegui updated the task description.

All hosts scheduled for decommissioning are now ready and have their own decommissioning tasks. We are waiting a few days before decommissioning them to make sure their replacements work OK.
Closing this as resolved.