Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers)
Closed, Resolved (Public)

Description

The following hosts are scheduled to be decommissioned in Q2 and need to be refreshed:
db1074-db1095 (22 servers)

We have started to see issues on them, especially BBU-related ones: T258360

Replacement plan:

  • db1124 B1 (old sanitarium host, currently in use) to be placed in s7
  • db1125 D1 (old sanitarium host, currently in use) to replace db1077 in testing-s4
  • db1156 A1 to replace db1074 (sanitarium master)
  • db1157 A5 to replace db1075
  • db1158 A5 to replace db1079 (sanitarium master)
  • db1159 A6 to replace db1080 (m1 master)
  • db1160 A6 to replace db1081
  • db1161 A8 to replace db1082 (sanitarium master)
  • db1162 B1 to replace db1076 (candidate master)
  • db1163 B3 to replace db1083 (s1 master) CURRENTLY pooled on s1 as stretch to substitute db1134 T274472
  • db1164 B5 to replace db1084
  • db1165 B6 to replace db1085 (sanitarium master) T258361#6923913
  • db1166 C3 to replace db1078
  • db1167 C3 to replace db1087 (sanitarium master)
  • db1168 C5 to replace db1088
  • db1169 C5 to replace db1089
  • db1170 C6 to replace db1090 (multi-instance)
  • db1171 C6 to replace db1095
  • db1172 D1 to replace db1092
  • db1173 D3 to replace db1093 (candidate master)
  • db1174 D6 to replace db1094
  • db1175 D3 to be placed on s3
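
The plan above can be cross-checked against the decommission range with a short script. This is only a sketch: the entries are condensed copies of the bullets above (role notes dropped), and the names `PLAN`, `parse_plan` and `unreplaced` are made up for illustration and are not part of any Wikimedia tooling.

```python
import re

# Condensed copies of the "to replace" bullets from the plan above.
# db1124 and db1175 are omitted on purpose: the plan places them in s7 and s3
# without naming a host they replace.
PLAN = """\
db1125 D1 to replace db1077
db1156 A1 to replace db1074
db1157 A5 to replace db1075
db1158 A5 to replace db1079
db1159 A6 to replace db1080
db1160 A6 to replace db1081
db1161 A8 to replace db1082
db1162 B1 to replace db1076
db1163 B3 to replace db1083
db1164 B5 to replace db1084
db1165 B6 to replace db1085
db1166 C3 to replace db1078
db1167 C3 to replace db1087
db1168 C5 to replace db1088
db1169 C5 to replace db1089
db1170 C6 to replace db1090
db1171 C6 to replace db1095
db1172 D1 to replace db1092
db1173 D3 to replace db1093
db1174 D6 to replace db1094
"""

# Matches "<new host> <rack> to replace <old host>".
ENTRY = re.compile(r"(db\d+) (\w+) to replace (db\d+)")


def parse_plan(text):
    """Return {old_host: (new_host, rack)} for every 'to replace' entry."""
    return {old: (new, rack) for new, rack, old in ENTRY.findall(text)}


def unreplaced(plan, first=1074, last=1095):
    """List hosts in the decommission range with no explicit replacement."""
    return [f"db{n}" for n in range(first, last + 1) if f"db{n}" not in plan]


plan = parse_plan(PLAN)
print(len(plan))         # 20
print(unreplaced(plan))  # ['db1086', 'db1091']
```

The check reports db1086 and db1091 as having no explicit "to replace" entry. db1124 and db1175 are listed as being placed in s7 and s3 without a named counterpart, which may account for the difference.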

Decommissioning progress

Event Timeline

Change 682355 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1158

https://gerrit.wikimedia.org/r/682355

Change 682355 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1158

https://gerrit.wikimedia.org/r/682355

Completed auto-reimage of hosts:

['db1124.eqiad.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2021-04-26T05:47:01Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1158 to dbctl, depooled, T258361', diff saved to https://phabricator.wikimedia.org/P15521 and previous config saved to /var/cache/conftool/dbconfig/20210426-054700-marostegui.json

Checking tables on db1124 after the transfer from db1158.

Change 682495 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] install_server: Do not format db1124

https://gerrit.wikimedia.org/r/682495

Change 682495 merged by Marostegui:

[operations/puppet@production] install_server: Do not format db1124

https://gerrit.wikimedia.org/r/682495

Change 682570 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Move db1125 from sanitarium to testing

https://gerrit.wikimedia.org/r/682570

Change 682570 merged by Marostegui:

[operations/puppet@production] mariadb: Move db1125 from sanitarium to testing

https://gerrit.wikimedia.org/r/682570

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1125.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202104261017_marostegui_29781.log.

Change 682571 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] check_private_data_report: Remove db1125

https://gerrit.wikimedia.org/r/682571

Change 682571 merged by Marostegui:

[operations/puppet@production] check_private_data_report: Remove db1125

https://gerrit.wikimedia.org/r/682571

Completed auto-reimage of hosts:

['db1125.eqiad.wmnet']

and were ALL successful.

Change 682794 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1124: Enable notifications

https://gerrit.wikimedia.org/r/682794

Change 682794 merged by Marostegui:

[operations/puppet@production] db1124: Enable notifications

https://gerrit.wikimedia.org/r/682794

Change 682795 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1124 to dbctl

https://gerrit.wikimedia.org/r/682795

Change 682795 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1124 to dbctl

https://gerrit.wikimedia.org/r/682795

Mentioned in SAL (#wikimedia-operations) [2021-04-27T04:45:20Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 to dbctl, depooled, T258361', diff saved to https://phabricator.wikimedia.org/P15540 and previous config saved to /var/cache/conftool/dbconfig/20210427-044520-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-27T04:46:10Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 with minimal weight for the first time in s7 T258361', diff saved to https://phabricator.wikimedia.org/P15541 and previous config saved to /var/cache/conftool/dbconfig/20210427-044609-marostegui.json

Pooled db1124 with minimal weight for the first time in s7

Mentioned in SAL (#wikimedia-operations) [2021-04-27T05:08:27Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1124 with minimal weight for the first time in s7 T258361', diff saved to https://phabricator.wikimedia.org/P15545 and previous config saved to /var/cache/conftool/dbconfig/20210427-050826-marostegui.json

I am automatically pooling db1124 into s7.

Mentioned in SAL (#wikimedia-operations) [2021-04-27T05:21:25Z] <marostegui> Stop mysql on db1087 to clone db1167 (lag will appear on wikidata on wikireplicas) T258361

Change 682885 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Productionize db1167

https://gerrit.wikimedia.org/r/682885

Change 682885 merged by Marostegui:

[operations/puppet@production] mariadb: Productionize db1167

https://gerrit.wikimedia.org/r/682885

Change 683121 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1167: Enable notifications

https://gerrit.wikimedia.org/r/683121

Change 683121 merged by Marostegui:

[operations/puppet@production] db1167: Enable notifications

https://gerrit.wikimedia.org/r/683121

Change 683124 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1167 to dbctl

https://gerrit.wikimedia.org/r/683124

Change 683124 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1167 to dbctl

https://gerrit.wikimedia.org/r/683124

Mentioned in SAL (#wikimedia-operations) [2021-04-28T05:51:45Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1167 in s8 T258361', diff saved to https://phabricator.wikimedia.org/P15605 and previous config saved to /var/cache/conftool/dbconfig/20210428-055144-marostegui.json

Change 683474 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add db1156 to dbctl

https://gerrit.wikimedia.org/r/683474

Change 683474 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add db1156 to dbctl

https://gerrit.wikimedia.org/r/683474

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:38:13Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1156 to dbctl T258361', diff saved to https://phabricator.wikimedia.org/P15624 and previous config saved to /var/cache/conftool/dbconfig/20210429-043812-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:38:57Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15625 and previous config saved to /var/cache/conftool/dbconfig/20210429-043857-marostegui.json

db1156 pooled in s2 with minimal weight

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:44:58Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15626 and previous config saved to /var/cache/conftool/dbconfig/20210429-044458-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:50:15Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15627 and previous config saved to /var/cache/conftool/dbconfig/20210429-045015-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-04-29T04:55:57Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1156 into s2 for the first time with minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P15629 and previous config saved to /var/cache/conftool/dbconfig/20210429-045557-marostegui.json

Automatically pooling db1156 into s2.
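
The four "Pool db1156 into s2" commits logged above carry the same message, and the log does not say what changed between them. As a rough illustration of how such SAL entries could be inspected, the sketch below extracts the timestamps and computes the gap between consecutive commits. The entries are shortened copies of the lines above; `SAL_LINES`, `commit_times` and `gaps_seconds` are hypothetical names, not existing tooling.

```python
import re
from datetime import datetime

# Shortened copies of the SAL entries above (timestamps unchanged).
SAL_LINES = [
    "[2021-04-29T04:38:57Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Pool db1156 into s2 for the first time with minimal weight T258361'",
    "[2021-04-29T04:44:58Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Pool db1156 into s2 for the first time with minimal weight T258361'",
    "[2021-04-29T04:50:15Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Pool db1156 into s2 for the first time with minimal weight T258361'",
    "[2021-04-29T04:55:57Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Pool db1156 into s2 for the first time with minimal weight T258361'",
]

# SAL timestamps look like [YYYY-MM-DDTHH:MM:SSZ] (UTC).
STAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\]")


def commit_times(lines):
    """Parse the bracketed UTC timestamp from each SAL line that has one."""
    times = []
    for line in lines:
        match = STAMP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S"))
    return times


def gaps_seconds(times):
    """Seconds elapsed between each pair of consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = gaps_seconds(commit_times(SAL_LINES))
print(gaps)  # [361, 317, 342]
```

The commits are spaced roughly five to six minutes apart.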

All the hosts in this task have been productionized. Pending: decommission the old ones.

Marostegui updated the task description.

All hosts scheduled for decommissioning are now ready and have their own decommissioning tasks. We are waiting a few days to make sure their replacements work OK.
Closing this as resolved.