Re-build db2097 s1 and s6
Closed, Resolved (Public)

Description

db2097 had a memory issue (T225378) that caused s6 to crash (s1 did not).
Even though a data checksum on s6 revealed no drifts, we should probably rebuild it.

Event Timeline

Marostegui renamed this task from "Re-build db2097 s1 and s6" to "Re-build db2097 s1 and s6 with Debian Buster and 10.3". (Jul 22 2019, 5:21 AM)
Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to Backlog on the DBA board.
Marostegui renamed this task from "Re-build db2097 s1 and s6 with Debian Buster and 10.3" to "Re-build db2097 s1 and s6". (Dec 31 2019, 12:19 PM)
Marostegui updated the task description.

I have renamed this and removed the 10.3 bits.

db2097 is a backup source host, so it is probably best not to experiment with 10.4 there until we are sure it works fine. That 10.4 test will be a goal for the next quarter, so for now we can either just rebuild s1 and s6 as they are or decline this task.
I will defer that decision to @jcrespo; I am fine either way (just rebuild s1 and s6, or just close this task). As mentioned, compare.py revealed no drifts.

I agree we should keep the backup sources on either the same version as the master or the version that >50% of the replicas run (aka "upgrade it at the same time as the master"). However, if we decide to move a whole section (e.g. s1 and s6 on codfw only) to a higher version, we could also upgrade the backup source of that dc to test the backup workflow on the new version.

jcrespo claimed this task.

This was rebuilt after the second crash: T252492 (that is why it took me so long to send it to dc ops). CC @Marostegui