Prepare new candidate master for s4
Closed, Resolved (Public)

Description

Since db1238 has logged some HW errors and has shown strange trends, let's prepare a different candidate master.

New candidate master: db1244

Event Timeline

Right now db1238 is being recloned from db1244. db1244 is in a different row and has a clean history, so I'll probably pick that one.

Change #1058011 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1244: Make it candidate master

https://gerrit.wikimedia.org/r/1058011

Change #1058011 merged by Marostegui:

[operations/puppet@production] db1244: Make it candidate master

https://gerrit.wikimedia.org/r/1058011

Mentioned in SAL (#wikimedia-operations) [2024-07-30T05:20:21Z] <marostegui> Change candidate master in s4 eqiad (this is a NOOP) T371343

dbctl and orchestrator changed.

db1238 has been recloned and db1244 is set up as candidate master. Both hosts are being repooled slowly and automatically.
db1238 HW issues can be tracked at T371342: db1238 bus critical errors

Marostegui updated the task description.

db1244 is in the same rack as the candidate master of s6: https://fault-tolerance.toolforge.org/map?cluster=db-master-candidates

We can leave it as is; one extra is not that bad.

Yeah, we can change the candidate master easily if needed. Let's leave it like that for now; we still don't know if the s4 issue is fixed anyway.
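
For reference, the rack overlap discussed above is the kind of thing that could be checked with a few lines of Python. This is only a rough sketch, not how the fault-tolerance map works: the `shared_racks` helper, the rack labels, and the host names other than db1244 are hypothetical placeholders, and the real rack assignments are not recorded in this task.

```python
from collections import defaultdict


def shared_racks(candidates):
    """Return {rack: [sections]} for racks that host more than one candidate master.

    `candidates` maps a section name to a (host, rack) tuple.
    """
    by_rack = defaultdict(list)
    for section, (_host, rack) in candidates.items():
        by_rack[rack].append(section)
    # Keep only racks shared by two or more sections.
    return {rack: sorted(sections) for rack, sections in by_rack.items() if len(sections) > 1}


# Hypothetical inventory: the rack labels and the s1/s6 host names are
# placeholders for illustration, not real data.
example = {
    "s4": ("db1244", "rack-A"),
    "s6": ("s6-candidate", "rack-A"),
    "s1": ("s1-candidate", "rack-B"),
}

print(shared_racks(example))  # {'rack-A': ['s4', 's6']}
```

An empty result would mean no two candidate masters share a rack. One shared rack, as in this example, is the "one extra" situation accepted above.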