Page MenuHomePhabricator

MediaWiki doesn't detect edit conflicts when replica databases don't have the latest edit
Closed, ResolvedPublic

Description

It would appear that MediaWiki doesn't detect edit conflicts when replica databases don't have the latest edit.

We've spent a lot of time recently testing the behavior of edit conflicts with DiscussionTools' reply tool (T252558), and noticing some cases where it appears to overwrite the previous edit when posting two comments very quickly:

After convincing myself that our code is correct, I tried reproducing the problem using plain old wikitext editor and, lo and behold:

I was thinking it's because of replication lag because I could never reproduce this locally (where I don't have any replicas), and because we just went through a lot of pain due to other code loading data from replicas instead of master (T259855).

I went to read the code, and this looks obviously wrong:

EditPage.php
	public function internalAttemptSave( &$result, $markAsBot = false ) {  
...
		# Load the page data from the master. If anything changes in the meantime,
		# we detect it by using page_latest like a token in a 1 try compare-and-swap.
		$this->page->loadPageData( 'fromdbmaster' );
		$new = !$this->page->exists();

		if ( $new ) {
...
			# Article exists. Check for edit conflict.

			$this->page->clear(); # Force reload of dates, etc.
			$timestamp = $this->page->getTimestamp();
			$latest = $this->page->getLatest();

So we very carefully load the data from master before checking if the page already exists, and if it doesn't, we immediately clear that data and load it from the default source, which happens to be replicas.

I tried blaming but this code seems to have existed since times immemorial.

Event Timeline

matmarex created this task.Sep 1 2020, 1:48 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 1 2020, 1:48 AM
DannyS712 added a subscriber: DannyS712.

Change 623474 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/core@master] EditPage: Use master instead of replicas when checking for edit conflict

https://gerrit.wikimedia.org/r/623474

I tried to figure out where this code comes from.

I found this commit from 2012, where the the offending line was deleted: rMW61f246aa786d: Made WikiPage recall the source of the data used to load its state..

It looks like it was re-introduced in rMW01f36b721f6f: merged latest master, which is a massive merge commit that once was on the "Wikidata" branch, and so it's terribly difficult to interpret. But I think the problem is in this diff of EditPage.php, which includes the changed lines from rMW61f246aa786d, but not the deletion: https://phabricator.wikimedia.org/source/mediawiki/change/master/includes/EditPage.php;01f36b721f6fb60000b625fd681993d50ebd0e40

As far as I can tell, this has been broken ever since.

Change 623474 merged by jenkins-bot:
[mediawiki/core@master] EditPage: Use master instead of replicas when checking for edit conflict

https://gerrit.wikimedia.org/r/623474

matmarex closed this task as Resolved.Sep 1 2020, 8:41 PM
matmarex claimed this task.
matmarex added a subscriber: daniel.

Thanks for the review @daniel.

I tested this on Beta again today and the problem seems fixed (https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Talk:T252558&action=history).