db2124 depooled with index corruption
Closed, Resolved · Public

Assigned To

Authored By

	RLazarus
	Mar 7 2024, 10:15 PM

Description

Hi DBA,

22:07:45 <+icinga-wm> PROBLEM - MariaDB Replica SQL: s6 #page on db2124 is CRITICAL: CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1034, Errmsg: Error Index for table page_props is corrupt: try to repair it on query. Default database: frwiki. [Query snipped] https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica

I depooled it at 22:10 and am about to downtime it for five days (through this time Tuesday), just in case SRE Summit travel means it takes longer to look at.
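
For anyone scripting triage around alerts like the one above, here is a minimal sketch of pulling the section, host, error number, table, and database out of the alert text. The regex and the `parse_alert` helper are hypothetical and only cover the wording seen in this one alert; they are not part of any existing Icinga or DBA tooling.

```python
import re

# Alert text copied from the description above ("[Query snipped]" and the URL omitted).
ALERT = (
    "PROBLEM - MariaDB Replica SQL: s6 #page on db2124 is CRITICAL: "
    "CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1034, "
    "Errmsg: Error Index for table page_props is corrupt: try to repair it on query. "
    "Default database: frwiki."
)

# Hypothetical pattern: it assumes the exact phrasing of this alert and
# would need adjusting for other checks or error messages.
PATTERN = re.compile(
    r"MariaDB Replica SQL: (?P<section>\S+) .*? on (?P<host>\S+) is CRITICAL"
    r".*?Slave_SQL_Running: (?P<sql_running>\w+), Errno: (?P<errno>\d+)"
    r".*?Index for table (?P<table>\S+) is corrupt"
    r".*?Default database: (?P<database>\w+)"
)


def parse_alert(text):
    """Return the named fields as a dict of strings, or None if the text does not match."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


fields = parse_alert(ALERT)
print(fields)
```

All values come back as strings, so `errno` would need an `int()` conversion before any numeric comparison.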

Details

	Subject	Repo	Branch	Lines +/-
	db2124: Disable notifications	operations/puppet	production	+1 -0


Related Objects

Mentioned Here: T352010: Gradually drop old pagelinks columns

Event Timeline

RLazarus created this task. Mar 7 2024, 10:15 PM

Restricted Application added a subscriber: Aklapper. Mar 7 2024, 10:15 PM

Icinga downtime and Alertmanager silence (ID=17885a36-8547-4a13-afea-8d73c87e272d) set by rzl@cumin2002 for 5 days, 0:00:00 on 1 host(s) and their services with reason: index corruption

db2124.codfw.wmnet

RLazarus triaged this task as High priority. Mar 7 2024, 10:21 PM

I don't see anything obviously hardware-broken in logs. I notice it was just repooled yesterday after maintenance for T352010, but nothing jumps out as an obvious cause. Over to the DBAs from here, enjoy. :)

I could probably fix it right away, but I think I am going to quickly reclone it instead.

Marostegui moved this task from Triage to In progress on the DBA board. Mar 8 2024, 6:30 AM

Change 1009638 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db2124: Disable notifications

https://gerrit.wikimedia.org/r/1009638

Change 1009638 merged by Marostegui:

[operations/puppet@production] db2124: Disable notifications

https://gerrit.wikimedia.org/r/1009638

Host recloned and being slowly repooled.
Thanks @RLazarus for addressing this incident!

Maintenance_bot moved this task from In progress to Done on the DBA board. Mar 8 2024, 7:29 AM

Maintenance_bot removed a project: Patch-For-Review.
