Sidebar problems on mediawiki.org 1.18: some message keys displaying instead of their contents
Closed, Resolved (Public)

Description

Split out from bug 31100:

https://bugzilla.wikimedia.org/attachment.cgi?id=9095
Krinkle 2011-09-24 12:09:02 PDT

Screenshot of mw.org with broken sidebar

REOPENING. Bug is still happening for some of the sidebar entries. They're not
related to subpages of specialpages though. It's the interface / linktext.

MediaWiki is not fetching the message from NS_MEDIAWIKI as it should, but is
spitting out raw message keys (which is the expected fallback).

Comment 6 Bawolff 2011-09-24 18:40:49 PDT

(In reply to comment #5)

Note (this might be useful to whoever is fixing this): it seems to work correctly
when a different language is specified with the uselang parameter. See my
comment on bug 31123.

Comment 7 Aaron Schulz 2011-09-25 00:21:33 PDT

Fixed on MW.org with a MessageCache::singleton()->clear() call in eval.php


Version: 1.18.x
Severity: major

attachment missing in source

Details

Reference
bz31177

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 21 2014, 11:53 PM)
bzimport set Reference to bz31177.

Aaron, Tim & I are poking over this in IRC, including direct server poking from Aaron & Tim.

There are definitely corrupted empty entries getting into message cache -- this is what causes the visible symptoms -- though we haven't yet determined how they get in there.

One avenue we've looked at a bit is possible transitory ExternalStorage errors; a failure to fetch a revision's text while doing MessageCache::loadFromDB() would end up saving an empty cache entry, just as we see.

Aaron's unable to find any evidence in the server logs yet (should be some notes in ExternalStoreDB debug log with the offending blob ids), but we need to check logs for a few days back probably.

If it is that, this may help:
https://gist.github.com/1243845
(checks for text load failure when building the cache from DB and stores a '!TOO BIG' entry which means 'go fetch on demand', which is better than a fake entry)
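The idea behind that tweak can be sketched roughly as follows. This is an illustrative Python sketch, not the gist's PHP code: only the '!TOO BIG' marker comes from this thread, while `build_message_cache` and `load_text` are hypothetical names, and the loader is assumed to return None when a text fetch fails.

```python
from typing import Callable, Dict, Iterable, Optional

# Marker meaning "not stored here; go fetch on demand" (named in the thread).
FETCH_ON_DEMAND = "!TOO BIG"


def build_message_cache(
    keys: Iterable[str],
    load_text: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Build a key -> text cache, marking failed loads for on-demand fetch.

    load_text is a hypothetical loader that returns None on failure.
    """
    cache: Dict[str, str] = {}
    for key in keys:
        text = load_text(key)
        if text is None:
            # The load failed: storing "" here would look like a real, empty
            # message and would hide the failure from later readers.
            cache[key] = FETCH_ON_DEMAND
        else:
            cache[key] = text
    return cache
```

With this shape, a transient storage error leaves a marker that later reads can retry, instead of an empty entry that is served as if it were the message.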

Tim's also looking further into possible slave server inconsistencies etc and doing a reset of the memcache entry...

(In reply to comment #2)

The ES logs go back so little that they don't help conclude anything either way.

r98200 on trunk (r98201 on 1.18wmf1) adds the tweak above plus some logging, and also avoids saving bogus cache entries when a later on-demand load in MessageCache::getMsgFromNamespace also fails.
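A rough sketch of the "don't cache a failed on-demand load" behaviour, in Python for illustration only. It is not the code from r98200: `get_message` and `load_text` are hypothetical names, the loader is assumed to return None on failure, and only the '!TOO BIG' marker and the 'MessageCache' log channel name are taken from this thread.

```python
import logging
from typing import Callable, Dict, Optional

FETCH_ON_DEMAND = "!TOO BIG"  # marker named in the thread
log = logging.getLogger("MessageCache")  # log channel name from the thread


def get_message(
    key: str,
    cache: Dict[str, str],
    load_text: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the message text, or None so the caller can fall back."""
    entry = cache.get(key)
    if entry is not None and entry != FETCH_ON_DEMAND:
        return entry  # a real cached value

    # Entry is missing or marked for on-demand loading: try the backend.
    text = load_text(key)
    if text is None:
        # Log the failure and leave the cache untouched, so that a
        # transient error is never persisted as a bogus entry.
        log.warning("failed to load text for message %r", key)
        return None

    cache[key] = text
    return text
```

Because a failed load writes nothing, the next request simply retries, and the log line gives the kind of data the comment above hopes to gather.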

We're not 100% sure this was the problem, but it should at least help gather data.

Debug logs for 'ExternalStoreDB' and 'MessageCache' over the next few days should be monitored. MessageCache will log if a text fetch failed (returned false) -- and if those were ES errors that should appear also in the ES log, but we don't seem to have enough data available from it to be sure.

However if it's another error that ends up returning an empty string... we might not catch it.

Note that the current state should be good on mediawiki.org: Tim cleared the message caches for it, so it'll either stay good or corrupt itself over time. :)

Marking as fixed (given brion's patch).