Rebuild db2203
Closed, Resolved (Public)

Description

This host got stuck during a switchover. The mysqld process is in a very weird state and, given that this host is the candidate master, it is safer to reclone and upgrade it.

root@db2203:~# ps aux | grep mariadb
mysql       1510  225 77.4 436050024 408705824 ? Ssl   2024 568765:55 /opt/wmf-mariadb106/bin/mysqld

root@db2203:~# w
 06:53:30 up 174 days, 21:25,  1 user,  load average: 0.21, 0.42, 1.48
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
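
To make the ps output above easier to read, here is a minimal sketch that decodes that one line into human-scale units. It assumes the standard procps `ps aux` column layout (VSZ and RSS in KiB, TIME as minutes:seconds); `summarize_ps_line` is a hypothetical helper for illustration, not part of any existing tooling.

```python
# Hypothetical helper for illustration only: decode one `ps aux` line.
# Assumes procps column order:
#   USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# with VSZ/RSS reported in KiB and TIME reported as MMM:SS.

PS_LINE = (
    "mysql       1510  225 77.4 436050024 408705824 ? Ssl   2024 "
    "568765:55 /opt/wmf-mariadb106/bin/mysqld"
)


def summarize_ps_line(line: str) -> dict:
    """Split a `ps aux` line and convert sizes to GiB and CPU time to days."""
    # Split on whitespace at most 10 times so the command (11th field)
    # keeps any arguments it may have.
    fields = line.split(None, 10)
    (_user, pid, cpu, mem, vsz_kib, rss_kib,
     _tty, _stat, _start, cpu_time, command) = fields

    minutes, seconds = cpu_time.split(":")
    cpu_seconds = int(minutes) * 60 + int(seconds)

    return {
        "pid": int(pid),
        "cpu_percent": float(cpu),
        "mem_percent": float(mem),
        "vsz_gib": int(vsz_kib) / 1024**2,   # KiB -> GiB
        "rss_gib": int(rss_kib) / 1024**2,   # KiB -> GiB
        "cpu_days": cpu_seconds / 86400,     # seconds -> days
        "command": command,
    }


if __name__ == "__main__":
    for key, value in summarize_ps_line(PS_LINE).items():
        print(f"{key}: {value}")
```

Under those assumptions, the resident set works out to roughly 390 GiB and the accumulated CPU time to roughly 395 days, which over about 175 days of uptime is consistent with the reported 225 %CPU.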

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2025-01-21T06:56:40Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2216 T384273', diff saved to https://phabricator.wikimedia.org/P72169 and previous config saved to /var/cache/conftool/dbconfig/20250121-065640-marostegui.json

Icinga downtime and Alertmanager silence (ID=9654e11c-3dfa-405c-94e3-10fc5d04d96f) set by marostegui@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: rebuilding index

db2216.codfw.wmnet

Change #1113015 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db2203: Disable notifications

https://gerrit.wikimedia.org/r/1113015

Change #1113015 merged by Marostegui:

[operations/puppet@production] db2203: Disable notifications

https://gerrit.wikimedia.org/r/1113015

Host recloned; now rebuilding tables on both db2216 and db2203.