
MessageIndexRebuild failing
Closed, Resolved, Public

Description

I have received multiple reports from WMF projects of the "pt-unknown-message" problem.

My new theory is that it is now caused by database replication lag, because WikiPageMessageGroup uses DB_SLAVE to load the message definitions.


Version: unspecified
Severity: critical

Details

Reference
bz37647

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 12:27 AM
bzimport set Reference to bz37647.

A little more explanation:

  • Someone marks a page for translation
  • That writes new data to translate_sections
  • It also triggers recreation of the message index within the same request. Basically, it iterates over all message groups, calling getDefinitions[1] to collect the message keys, builds a huge array of message key => message group id, and stores it somewhere.

[1] The failure happens here, because getDefinitions loads the sections from translate_sections using DB_SLAVE, and the replica may not yet have the rows that were just written.
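
To make the suspected race concrete, here is a minimal toy model in Python. It is not MediaWiki or Translate code: ToyDatabase, build_message_index, and the sample keys and group id are all hypothetical names made up for illustration. It only assumes what is described above, namely that the write goes to the master while the index rebuild reads from a lagging replica.

```python
class ToyDatabase:
    """Hypothetical master/replica pair; the replica only sees rows after replicate()."""

    def __init__(self):
        self.master = []    # rows written by the current request
        self.replica = []   # rows visible to DB_SLAVE-style reads

    def write(self, row):
        self.master.append(row)

    def replicate(self):
        # Stand-in for replication eventually catching up.
        self.replica = list(self.master)

    def read(self, from_master=False):
        return list(self.master if from_master else self.replica)


def build_message_index(db, group_id, from_master=False):
    """Map each message key to its group id, as the index rebuild is described above."""
    return {key: group_id for key in db.read(from_master)}


db = ToyDatabase()
db.write("Page/1")  # marking a page for translation writes its sections
db.write("Page/2")

# Same request, replica still lagging: the new keys are missing from the index.
stale_index = build_message_index(db, "page-Page")

# Once the replica has caught up, the same rebuild sees the keys.
db.replicate()
fresh_index = build_message_index(db, "page-Page")
```

In this model the index built inside the same request is empty, which would match the new keys being unknown to the index until a later rebuild.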

Maybe do something like wfGetLB()->waitFor( wfGetLB()->getMasterPos() ) after writing to translate_sections?
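
The idea behind that suggestion can be sketched with a toy model in Python. This is not the MediaWiki load balancer API: ToyLoadBalancer and its methods are hypothetical stand-ins that only mirror the shape of the waitFor / getMasterPos call above, and the "wait" here simply applies pending log entries instead of blocking with a timeout as a real system would.

```python
class ToyLoadBalancer:
    """Hypothetical model: the replica applies the master log up to replica_pos."""

    def __init__(self):
        self.master_log = []   # every write, in order
        self.replica_pos = 0   # number of log entries the replica has applied

    def write(self, row):
        self.master_log.append(row)

    def get_master_pos(self):
        # Position of the latest write on the master.
        return len(self.master_log)

    def wait_for(self, pos):
        # Toy stand-in for waiting: apply entries until the replica reaches pos.
        while self.replica_pos < pos:
            self.replica_pos += 1

    def read_replica(self):
        return self.master_log[: self.replica_pos]


lb = ToyLoadBalancer()
lb.write("Page/1")
lb.write("Page/2")

before = lb.read_replica()          # replica has not applied the writes yet
lb.wait_for(lb.get_master_pos())    # wait until the replica reaches the master position
after = lb.read_replica()           # replica reads now include the new rows
```

The trade-off in this approach is that the request marking the page has to wait out the replication lag before rebuilding the index; reading the sections from the master for this one rebuild would be an alternative, at the cost of extra load on the master.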

Please have a look at https://gerrit.wikimedia.org/r/11835
I can't test this myself.

*** Bug 37715 has been marked as a duplicate of this bug. ***

Assuming this is fixed. Please test and either reopen the bug or mark as verified.