Some interface messages (e.g. sitenotice, others) are loading old revisions of their messages
Closed, Resolved (Public)

Assigned To

Authored By

	Ladsgroup
	Mar 21 2019, 5:08 PM

Description

On simplewiki, MediaWiki:Mycontris is set to "My changes" and was changed more than nine years ago, but at the top of the page it shows "My edits":

It's not falling back to English; it's using a really old value instead, so I'm confused about what's going on here.

Details

Subject	Repo	Branch	Lines +/-
Fix MessagecacheTest::testLoadFromDB_fetchLatestRevision	mediawiki/core	master	+1 -3
Disable flapping MessageCacheTest::testLoadFromDB_fetchLatestRevision()	mediawiki/core	master	+2 -0
Only load latest revision in MessageCache::loadFromDB	mediawiki/core	wmf/1.33.0-wmf.22	+50 -2
Only load latest revision in MessageCache::loadFromDB	mediawiki/core	master	+50 -2

Related Objects

Status	Subtype	Assigned	Task
Resolved	Release	zeljkofilipin	T206676 1.33.0-wmf.22 deployment blockers
Resolved		daniel	T218918 Some interface messages (e.g. sitenotice, others) are loading old revisions of their messages
Resolved		daniel	T219042 Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision

Event Timeline

There are a very large number of changes, so older changes are hidden.

T218983 suggests that this problem exists for *all* locally overwritten interface messages. Can anyone confirm this?

That ticket also states that a null edit fixes the issue for a moment, but then it's reverted to the old bad state.

I'm gonna go ahead and say it's an off-by-one error. For
https://sv.wikipedia.org/wiki/MediaWiki:Excontent
https://sv.wikipedia.org/wiki/MediaWiki:Excontentauthor
https://sv.wikipedia.org/wiki/MediaWiki:Nstab-main
https://sv.wikipedia.org/wiki/MediaWiki:Ipboptions
https://sv.wikipedia.org/wiki/MediaWiki:Sitenotice
it seems to show the [ current - 1 ] revision.

In T218918#5047707, @daniel wrote:

I note that I see "My changes" at the top of the page on simplewiki, as expected. Has this been fixed by some kind of purge since this ticket was filed?

As I said in T218983, resaving a message does fix the issue, but only for a few seconds, or perhaps a minute.

@Nikerabbit tipped me off to the possibility of the query fetching too many revisions, rather than failing to find revisions. This is indeed the case: the rev_id = page_latest condition is lost in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/496816/5/includes/cache/MessageCache.php. Fix upcoming.

Krenair subscribed.Mar 22 2019, 12:25 PM

Change 498362 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Only load latest revision in MessageCache::loadFromDB

https://gerrit.wikimedia.org/r/498362

gerritbot added a project: Patch-For-Review.Mar 22 2019, 12:28 PM

Change 498363 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@wmf/1.33.0-wmf.22] Only load latest revision in MessageCache::loadFromDB

https://gerrit.wikimedia.org/r/498363

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{int:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

~~Scheduled SWAT deployment between 18:00 and 19:00 UTC.~~

No SWAT today, it's Friday. Silly me.

After the fix is live, affected messages can be fixed by purging the respective pages in the MediaWiki namespace.
We may be able to purge them in bulk, but not sure yet.

In T218918#5047848, @daniel wrote:

Scheduled SWAT deployment between 18:00 and 19:00 UTC.

There's no SWAT today (it's Friday, and you put it in yesterday's SWAT window), but this task is UBN!, which gives it permission to be deployed at any time.

In T218918#5047860, @Ladsgroup wrote:

There's no SWAT today (it's Friday, and you put it in yesterday's SWAT window), but this task is UBN!, which gives it permission to be deployed at any time.

Hahaha! ooops!

Some thoughts on how to update caches:

Calling MessageCache::load() should work, perhaps from a one-off script. There are some caveats to this:

The method is protected. Would need to use reflection to call it, or make it accessible.
The method is per language. We'd probably want to call it for the content language at least. Other languages may be less important.
MessageCache::load() grabs a lock, other threads trying to do the same may pile up. These shouldn't be too frequent, though. Also, running load() shouldn't take very long (wild guess: ten seconds).

MessageCache::clear() could also work, but then we have no control over when and how the re-caching is triggered. For all I know, it may happen during a web request, possibly causing a timeout.

abian subscribed.Mar 22 2019, 1:31 PM

MessageCache::clear would fit the bill, but I'm not sure it will be able to fully avoid a cache refill storm due to web requests (it should be fine due to the use of locks, but that's not my area of expertise). Bumping MSG_CACHE_VERSION would probably do the same.

$wgMsgCacheExpiry seems to be unaltered, default is one day, so we could just wait.

MessageCache::recache method would be ideal, but doesn't exist.

Trying to purge with manual edits is both difficult and doesn't update the whole cache anyway, so that is out of the question.

Perhaps one could try to write a script that temporarily makes one of the conditions in MessageCache::isCacheExpired return false (most likely the return value of wfTimestampNow) so that it gets recached with (in)direct calls to load(), but that's getting way too complicated imho.

CCicalese_WMF moved this task from Inbox to Reactive on the Multi-Content-Revisions board.Mar 22 2019, 1:43 PM

CCicalese_WMF edited projects, added Multi-Content-Revisions (Reactive); removed Multi-Content-Revisions.

Dave_Braunschweig unsubscribed.Mar 22 2019, 1:45 PM

In T218918#5047822, @Daimona wrote:

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{int:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

• Mholloway unsubscribed.Mar 22 2019, 1:47 PM

In T218918#5047968, @Lucas_Werkmeister_WMDE wrote:

In T218918#5047822, @Daimona wrote:

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{int:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

That's correct. I have to say that I didn't check the patch before writing my comment. It could be a different issue, although that would possibly be a peculiar coincidence.

In T218918#5047973, @Daimona wrote:

In T218918#5047968, @Lucas_Werkmeister_WMDE wrote:

In T218918#5047822, @Daimona wrote:

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{int:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

That's correct. I have to say that I didn't check the patch before writing my comment. It could be a different issue, although that would possibly be a peculiar coincidence.

Translatewiki.net isn't even running the code that caused this issue yet, so it can't be the same thing. Also, message transclusion is a rather different beast. Seeing a "flash" of another version indicates the usage of some JavaScript trickery to apply localization in the user language. This all sounds like it's a different issue.

@daniel Well, while it's true that some JS is altering the message, that JS isn't from a user or wiki script, as the issue still happens in safemode. Maybe VE is replacing the standard message with its own? But yes, it could be a different bug. I guess we'll see after the patch above is deployed.

In T218918#5047932, @Nikerabbit wrote:

MessageCache::recache method would be ideal, but doesn't exist.

It looks to me like MessageCache::loadFromDBWithLock should have more or less the desired effect – specifically, these lines:

$cache = $this->loadFromDB( $code, $mode );
$this->cache->set( $code, $cache );
$saveSuccess = $this->saveToCaches( $cache, 'all', $code );

It’s protected, but could be called from eval.php, e.g. via TestingAccessWrapper.

Zppix subscribed.Mar 22 2019, 2:34 PM

Amorymeltzer subscribed.Mar 22 2019, 2:37 PM

AntiCompositeNumber subscribed.Mar 22 2019, 2:38 PM

Teles subscribed.Mar 22 2019, 2:39 PM

In T218918#5047715, @Nirmos wrote:

I'm gonna go ahead and say it's an off-by-one error.

For the record: it's not an off-by-one error. The problem was that the query would load all revisions, instead of just the latest revision, and write them all to the cache, all using the same key. So the last revision in the query result would win. Since no order is specified on the query, which revision will come last is undefined in theory, but in practice, it seems to always be the one with the largest revision ID. This will usually be the latest revision, so the bug doesn't occur in most cases, and wasn't caught in testing.

To reproduce the bug, I had to import an old revision, to create a revision with a higher revision ID that was still not the correct current revision. The same effect of creating an old revision with a new/high revision ID could in the past also be triggered by deleting and undeleting a page, but we have been restoring the original revision ID for a while now.

I suspect that the affected messages are ones with undeleted or imported revisions. Or the order of the result set is simply different in production, due to index optimization and query planning on large databases.
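The mechanism described above can be reproduced outside MediaWiki. The following is a minimal Python sketch using an in-memory SQLite database, not the actual MessageCache code: the two-table schema is heavily simplified, the function and constant names are made up for illustration, and the explicit `ORDER BY rev_id` is an assumption added only to mimic the "largest revision ID comes last" behaviour deterministically.

```python
import sqlite3

def build_db():
    """Create a toy page/revision schema with one message page.

    Revision 20 is the current one (page_latest). Revision 30 stands in
    for an imported old revision that received a newer, higher rev_id.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                           page_title TEXT, page_latest INTEGER);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                               rev_page INTEGER, rev_text TEXT);
        INSERT INTO page VALUES (1, 'Mycontris', 20);
        INSERT INTO revision VALUES (10, 1, 'My edits');
        INSERT INTO revision VALUES (20, 1, 'My changes');
        INSERT INTO revision VALUES (30, 1, 'Imported old text');
    """)
    return db

# Hypothetical join conditions: the broken one has lost rev_id = page_latest.
BROKEN_JOIN = "rev_page = page_id"
FIXED_JOIN = "rev_page = page_id AND rev_id = page_latest"

def load_from_db(db, join_condition):
    """Write every returned row to the cache under the page title.

    With the broken join, all revisions share one key, so the last row wins.
    """
    cache = {}
    rows = db.execute(
        "SELECT page_title, rev_text FROM page JOIN revision ON "
        + join_condition + " ORDER BY rev_id"
    )
    for title, text in rows:
        cache[title] = text
    return cache

if __name__ == "__main__":
    db = build_db()
    print(load_from_db(db, BROKEN_JOIN))  # the imported revision wins
    print(load_from_db(db, FIXED_JOIN))   # only the latest revision is loaded
```

In this toy setup the bug is invisible until a revision with a higher ID than the current one exists, which matches the observation that it was not caught in testing.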

In T218918#5048073, @Lucas_Werkmeister_WMDE wrote:

It looks to me like MessageCache::loadFromDBWithLock should have more or less the desired effect – specifically, these lines:

You are right, loadFromDBWithLock() is probably the best choice. load() would not work, since it loads from the cache instead of refreshing it. And loadFromDB() itself only loads, it doesn't do anything to caches.

But then, if the bad messages have been coming back periodically (due to some cron job or job runner activity, presumably), I'd expect the good messages to be restored soon, in the same way.
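The difference between the three methods discussed above can be sketched with a toy model. This is a loose Python illustration, not MediaWiki's implementation: the class, its attributes, and the use of plain dicts for the database and the shared cache are all assumptions made for the example.

```python
import threading

class ToyMessageCache:
    """Toy stand-in for the load / loadFromDB / loadFromDBWithLock split."""

    def __init__(self, db):
        self.db = db        # dict standing in for the database
        self.shared = {}    # dict standing in for the shared cache
        self.lock = threading.Lock()

    def load(self, code):
        # Prefers the cache: a stale entry is returned as-is.
        if code in self.shared:
            return self.shared[code]
        return self.load_from_db_with_lock(code)

    def load_from_db(self, code):
        # Only reads from the database; no cache is touched.
        return dict(self.db[code])

    def load_from_db_with_lock(self, code):
        # Reads from the database and overwrites the cached copy.
        with self.lock:
            messages = self.load_from_db(code)
            self.shared[code] = messages
            return messages

if __name__ == "__main__":
    mc = ToyMessageCache({"sv": {"Sitenotice": "new"}})
    mc.shared["sv"] = {"Sitenotice": "stale"}
    print(mc.load("sv"))                    # still the stale cached value
    mc.load_from_db_with_lock("sv")
    print(mc.load("sv"))                    # refreshed from the database
```

Under these assumptions, only the locked variant replaces a stale entry that is already cached, which is why it looks like the right tool for a one-off refresh.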

FTR, even if mine is a different issue, it has the same solution: purging the local page of the message, even if it doesn't exist, fixed it.

Jdforrester-WMF merged a task: T219002: Japanese Wikipedia displaying SOPA banner on desktop.Mar 22 2019, 3:18 PM

Jdforrester-WMF added subscribers: elappen-WMF, • Jseddon.

Worth putting in Tech/News as relatively high profile.

Just FYI, Wikiversity User:Billinghurst incremented the MediaWiki sitenotice ID by one on Wikiversity, and the very old site notice disappeared, at least for now.

Fixed SOPA site notice displaying on ja.wiki

A similar SOPA site notice had to be removed from Portuguese Wikipedia yesterday evening.

Change 498362 merged by jenkins-bot:
[mediawiki/core@master] Only load latest revision in MessageCache::loadFromDB

https://gerrit.wikimedia.org/r/498362

There was such an issue on ptwiki with the sitenotice yesterday. When I performed a zero-edit (i.e. opened the page for editing and clicked save without changes), the banner was gone.

Change 498363 merged by jenkins-bot:
[mediawiki/core@wmf/1.33.0-wmf.22] Only load latest revision in MessageCache::loadFromDB

https://gerrit.wikimedia.org/r/498363

@Jseddon, @Rxy: Just editing MediaWiki:Sitenotice isn't good enough because it's going through the revisions and picking another one at random. You need to hide it with CSS like this: https://sv.wikipedia.org/w/index.php?diff=45286235

Incrementing the Sitenotice_id seems to work on ja.wiki.

Mentioned in SAL (#wikimedia-operations) [2019-03-22T15:58:07Z] <James_F> UBN hot-deploy for T218918: Only load latest revision in MessageCache::loadFromDB

OK, hot-fix is now emergency-deployed. Theoretically this means this should be fixed for new page impressions. It's probable that some broken pages will be in the caches, however, so logged-out users may well get the wrong experience.

ReleaseTaggerBot added a project: MW-1.33-notes (1.33.0-wmf.23; 2019-03-26).Mar 22 2019, 4:01 PM

Johan moved this task from To Triage to In current Tech/News draft on the User-notice board.Mar 22 2019, 4:19 PM

What's the status? Is the problem gone?

In T218918#5048633, @daniel wrote:

What's the status? Is the problem gone?

It works for me if I manually purge the corresponding pages in the MediaWiki namespace (what you suggested on T218918#5047858).

In T218918#5048686, @abian wrote:

It works for me by manually purging the corresponding pages in the MediaWiki namespace (what you suggested on T218918#5047858).

Can you confirm that they do not revert back to a bad state after an hour or so? This is what was happening before. Purging always helped, but didn't stick.

Still working for me.

Provisionally declaring fixed, then.

Jdforrester-WMF edited projects, added MW-1.33-notes (1.33.0-wmf.22; 2019-03-19); removed MW-1.33-notes (1.33.0-wmf.23; 2019-03-26), Patch-For-Review.Mar 22 2019, 6:25 PM

In T218918#5048162, @daniel wrote:

Since no order is specified on the query, which revision will come last is undefined in theory, but in practice, it seems to always be the one with the largest revision ID. This will usually be the latest revision, so the bug doesn't occur in most cases, and wasn't caught in testing.

More specifically, at least on MySQL/MariaDB it usually matches the order of whatever index the DB used for the fetching of the rows. For the broken query here there are several options. Which of the several candidate indexes it picks is likely up to the details of the replica's index statistics, which can vary between replicas and might change as new revisions are inserted or deleted. If it picks rev_page_id or page_timestamp that would result in it using the latest revision by rev_id or rev_timestamp, but there are other options.
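To make the effect of index choice concrete, here is a small Python sketch under stated assumptions: the three sample rows are invented, and sorting by a column is used as a stand-in for the database returning rows in the order of whichever index it picked. It is not a model of the MySQL/MariaDB query planner.

```python
from operator import itemgetter

# Invented sample data: revision 30 is an imported old revision, so it has
# the highest rev_id but an older rev_timestamp than the current revision 20.
REVISIONS = [
    {"rev_id": 10, "rev_timestamp": "20090101000000", "text": "oldest"},
    {"rev_id": 20, "rev_timestamp": "20190301000000", "text": "current"},
    {"rev_id": 30, "rev_timestamp": "20120101000000", "text": "imported"},
]

def winner(rows, index_column):
    """Return the text left in the cache when rows arrive in index order."""
    cache = {}
    for row in sorted(rows, key=itemgetter(index_column)):
        cache["message"] = row["text"]  # same key every time: last row wins
    return cache["message"]

if __name__ == "__main__":
    print(winner(REVISIONS, "rev_id"))         # imported
    print(winner(REVISIONS, "rev_timestamp"))  # current
```

The same broken query can therefore look correct on one replica and wrong on another, depending only on the order in which rows come back.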

matmarex merged a task: T219049: Sitenotice error on Marathi Wikipedia.Mar 23 2019, 3:05 AM

matmarex added subscribers: Tiven2240, abhaynatu.

Mainframe98 mentioned this in T219042: Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision.Mar 23 2019, 7:07 AM

Screenshot_2019-03-23-08-08-48.png (1×720 px, 188 KB)

Screenshot_2019-03-23-13-10-49.png (1×720 px, 89 KB)

These are two live screenshots of the sitenotice, which is showing content from October 2018. This has been happening for the past 2 days, and users are being misled by it. Please fix it.

The sites are hi.wikipedia.org and mr.wikipedia.org

In T218918#5050055, @Tiven2240 wrote:

Please fix it.

Did they happen after the fix was released into production? (That is, after 2019-03-22 16:00 UTC, or 2019-03-22 19:30 IST.)

Do the problems remain for logged-out users after action=purge?

@Jdforrester-WMF Yes to both.

OK, can you give any URLs where it occurs? I can't replicate that here.

On hiwiki, I found the wrong sitenotice (this one instead of the one) on this article via Special:Random; it had been generated at an old time (2019-03-08T23:50:51, found in the page HTML). After an action=purge, the timestamp was updated (to 2019-03-23T19:19:07). Once I reloaded the page, it showed the correct (current) sitenotice.

Similarly, on mrwiki, I found the wrong sitenotice (this one instead of blank) on this article, generated at 2019-03-21T20:04:03; purging bumped it to 2019-03-23T19:24:10.

Unfortunately it's not possible for us to purge the old cached pages. A quick hack for the sitenotice would be to set .mw-dismissable-notice { display: none } in the site's Common.css until the caches finally expire in thirty days' time.

Change 498692 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[mediawiki/core@master] Disable flapping MessageCacheTest::testLoadFromDB_fetchLatestRevision()

https://gerrit.wikimedia.org/r/498692

gerritbot added a project: Patch-For-Review.Mar 24 2019, 1:53 AM

Change 498692 merged by jenkins-bot:
[mediawiki/core@master] Disable flapping MessageCacheTest::testLoadFromDB_fetchLatestRevision()

https://gerrit.wikimedia.org/r/498692

ReleaseTaggerBot edited projects, added MW-1.33-notes (1.33.0-wmf.23; 2019-03-26); removed MW-1.33-notes (1.33.0-wmf.22; 2019-03-19).Mar 24 2019, 3:01 AM

Jtneill subscribed.Mar 25 2019, 4:02 AM

daniel added a subtask: T219042: Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision.Mar 25 2019, 10:40 AM

Blocked this on T219042: Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision. The regression test I added for the fix seems to "sometimes" fail, indicating that the problem "sometimes" still happens - maybe only in the test, maybe also in production. I'm unable to reproduce.

In any case, this ticket should not be closed until we have a regression test that reliably passes.

Change 498804 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] DNM: Fix MessagecacheTest::testLoadFromDB_fetchLatestRevision

https://gerrit.wikimedia.org/r/498804

A fix is up for the flaky test: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/498804
I screwed up and put an invalid timestamp in, causing it to default to "now".
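One plausible reading of how an invalid timestamp turns into "now" is a lenient parser with a fallback. The Python sketch below illustrates that general pattern only; the function names are hypothetical and it does not reproduce MediaWiki's timestamp handling or the actual test.

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp layout, e.g. 20080101000000

def lenient_timestamp(value, now):
    """Parse a timestamp, silently falling back to 'now' on bad input."""
    try:
        return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)
    except ValueError:
        return now

def strict_timestamp(value):
    """Parse a timestamp and let bad input raise ValueError."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)

if __name__ == "__main__":
    now = datetime(2019, 3, 25, tzinfo=timezone.utc)
    # A fixture meant to be "old" ends up stamped with the current time.
    print(lenient_timestamp("not-a-timestamp", now) == now)
    # A valid old timestamp stays in the past, as the fixture intended.
    print(lenient_timestamp("20080101000000", now) < now)
```

With a silent fallback, a revision intended to be old is no longer reliably older than the current one, so any ordering by time can vary between runs. A strict parser would have failed immediately instead.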

According to what @Jdforrester-WMF said above, the problem reported by @Tiven2240 is probably due to caching on some level. If there are no new occurrences reported, I suppose this ticket can be closed again.

Change 498804 merged by jenkins-bot:
[mediawiki/core@master] Fix MessagecacheTest::testLoadFromDB_fetchLatestRevision

https://gerrit.wikimedia.org/r/498804

Declaring this fixed for now.

Jdforrester-WMF closed subtask T219042: Flaky test MessageCacheTest::testLoadFromDB_fetchLatestRevision as Resolved.Mar 25 2019, 4:17 PM

Liuxinyu970226 unsubscribed.Mar 26 2019, 12:15 AM

CCicalese_WMF moved this task from Doing to Done with CPT on the Platform Team Workboards board.Mar 26 2019, 2:31 PM

CCicalese_WMF edited projects, added Platform Team Workboards (Done with CPT); removed Platform Team Workboards (Doing).

Johan moved this task from In current Tech/News draft to Already announced/Archive on the User-notice board.Mar 28 2019, 9:59 PM

CCicalese_WMF moved this task from Reactive to Done on the Multi-Content-Revisions board.Apr 1 2019, 1:16 PM

CCicalese_WMF edited projects, added Multi-Content-Revisions; removed Multi-Content-Revisions (Reactive).

Framawiki subscribed.May 10 2019, 6:24 PM

CCicalese_WMF edited projects, added Core Platform Team Initiatives (MCR); removed Platform Engineering (MCR).Jul 30 2019, 7:40 PM

Aklapper removed a subscriber: Anomie.Oct 16 2020, 5:38 PM

Maintenance_bot edited projects, added User-notice-archive; removed User-notice.Aug 13 2022, 12:55 PM

zeljkofilipin unsubscribed.Aug 23 2022, 10:48 AM

	F28448609: Screenshot_2019-03-23-08-08-48.png
	Mar 23 2019, 7:50 AM

	F28448612: Screenshot_2019-03-23-13-10-49.png
	Mar 23 2019, 7:50 AM
