db2042 reimage
Closed, Resolved (Public)

Description

The tables were either already corrupted or became corrupted during the transfer. After reinstalling, the following warning appeared: "WARNING: Slot 0: Predictive Failure: 1I:1:11". This means we have lost all rc hosts in codfw.

I will try the transfer again or clone from another host and repartition.

Event Timeline

Change 320608 had a related patch set uploaded (by Jcrespo):
Depool db2048 for maintenance

https://gerrit.wikimedia.org/r/320608

Change 320608 merged by jenkins-bot:
Depool db2048 for maintenance

https://gerrit.wikimedia.org/r/320608

I have recovered the data from db2048 while both hosts are depooled. However, that means I have to repartition. I will do that once replication lag goes down.

Stashbot subscribed.

Mentioned in SAL (#wikimedia-operations) [2016-11-10T09:38:27Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php Depool db1059 - T149079. Repool db2048 T150334 (duration: 00m 50s)

Partitioning done:

Query OK, 691803971 rows affected (3 days 3 hours 22 min 9.74 sec)

We will copy to db2034 (or whatever replaces it) from here.

Actually, we still need to repool it.

Change 321476 had a related patch set uploaded (by Jcrespo):
Repool db2042 after maintenance

https://gerrit.wikimedia.org/r/321476

Change 321476 merged by jenkins-bot:
Repool db2042 after maintenance

https://gerrit.wikimedia.org/r/321476