Page MenuHomePhabricator

Do not overwrite edits when conflicting with self
Open, Needs TriagePublic

Description

Currently, MediaWiki will ignore conflicts between edits of the same user, causing previous edits to be overwritten, if there were no edits by other users since the edit's base revision (and the current edit is not a section edit). The relevant code is at https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/EditPage.php;56ade0650d492e01d2689dc918e7db3d56d18147$1997

This behavior seems undesirable: If I have the same page open in two tabs for some reason, and edit in both (possibly after hours or days), the second edit will undo the first edit, unless someone else has edited in the meantime. There seems to be no documented justification for this.

History:
Originally, conflict detection would be disabled if the current user was the creator of the edit's base revision, no matter if other users edited. This behavior can be found the first revision of MediaWiki on record in git, Lee Daniel Crocker's "Initial revision" from 2003: https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/Article.php;d82c14fb4fbac288b42ca5918b0a72f33ecb1e69$419

Later, brion changed this so merging would work on section edits, but conflicts on whole page edits would still overwrite: https://phabricator.wikimedia.org/rMWc6e870e5c6f85a693e01c0c16ce027b4b7e45480

In 2008, Aaron added a check that made sure the current user was the only one to edit after the base revision: https://phabricator.wikimedia.org/rMW43129ccd4426f3d059b7851cc8acc02360070601

This was rather tricky to track down, since the method was moved from Article to EditPage and then later to Revison (by me).

But the big question is: WHY do we do that? It seems like we are going out of our way to do the wrong thing in an edge case. Am I missing some obvious benefit of this?

Event Timeline

daniel created this task.Sep 12 2017, 8:14 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 12 2017, 8:14 PM

Historical justification, as per Brion on IRC:

[22:20] <brion> So
[22:20] <brion> Back in the day if you used the edit form and saved
[22:21] <brion> And then clicked back and made a further change and save again
[22:21] <brion> It would edit conflict
[22:21] <brion> I think this was a hack for that
[22:21] <brion> But yeah if it fucks you up it's... not pretty
[22:21] <DanielK_WMDE> ah... that would work because "back" took you back to a text field that still had the modified text, which you could then adjust?
[22:22] <DanielK_WMDE
> this was before tabbed browsing, yes?
[22:22] <DanielK_WMDE__> this seems to be doing more harm then good...
[22:22] <brion> Yeah Netscape 4 days
[22:22] <brion> It may be... Unwise now.

I'd be OK with changing this, but it'll likely have some fallout with editing tools relying on this behaviour.

Change 377813 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Do not ignore edit conflicts with oneself.

https://gerrit.wikimedia.org/r/377813

I remember suffering silly self-conflicts. An additional issue is that ~~~~ magic expansion almost ensures that there will be a conflict, even if it could otherwise be automatically resolved.

aaron explained it quite well on https://phabricator.wikimedia.org/rMW43129ccd4426f3d059b7851cc8acc02360070601

It's not strange at all to notice a typo in what you wrote just after pressing Save (ie. while it is connecting with the server). So what do you do? You press Esc to stop it hoping that you were fast enough. Most probably, you weren't (it was saved), and your amended edit will be a self-conflict.

That seems more common than editing the same article on two different tabs without noticing. Maybe we should try to detect if it was from the same or a different tab?

Related tasks: T2275 T3150

I just recently accidentally overwrote my own changes when using two tabs. I think it would be better to ask confirmation before overwriting on these cases rather than doing it silently.

If we still want to cover the "back button" use case, I think we should try and detect that situation on the client side.

A bit of JS could set a flag in the form just before it is submitted for saving, so the form knows the contents of the text field have already been saved. It can then warn and offer to load the latest version into the editor when the edit field gets focus again.

With JS disabled, you'd get a conflict if you edit the same paragraph again. With JS on, you would be able to do the same thing you do now, with one extra click, and without the danger of accidentally reverting yourself.

Tagging for TechCom review. Would be great to move this forwards somehow.

tstarling moved this task from Inbox to Backlog on the TechCom board.Jul 18 2018, 8:37 PM

Change 377813 abandoned by Daniel Kinzler:
Do not ignore edit conflicts with oneself.

Reason:
attic

https://gerrit.wikimedia.org/r/377813

daniel added a comment.EditedMar 9 2020, 3:59 PM

The problem that prompted the "ignore-self-conflict" behavior to be implemented seems to no longer exist.
I just tested this locally (Firefox 73, Chromium 80, with and without JS): when I edit a page, and then hit the "back" button to go back to the edit page, the edit form is apparently re-loaded, so it not only has the new text, but it also has the new values in the editRevId and wpEdittime fields. It doesn't matter whether the same user or a different user edits in the mean time, hitting "back" to gets back to the edit page will load the edit page with the data for the current value, as expected.

This indicates that ignoring self-conflicts is no longer needed to prevent users from experiencing spurious conflicts. So the only thing it does is cause data loss when users edit in multiple tabs.

awight added a subscriber: awight.Mar 10 2020, 7:48 AM

I'm in favor of dropping the old behavior. Even if it were still possible to get in tho this use case "noticed a typo, cancel form submission, fix typo and resubmit" which justifies ignoring a self-conflict, the conflict interface has improved to a point where it's perfectly reasonable to expect an editor to resolve the conflict explicitly. Assuming that we know the user's intention, without any heuristics to distinguish between the typo use case and accidentally editing in two tabs, seems unwise.

Here's what the typo case will look like to users (simulated by having a second editor make the conflicting edit). The second edit added the text "additional" on a non-overlapping and an overlapping line:

Legacy conflict workflow,

Two-column conflict workflow,

tstarling moved this task from Inbox to Backlog on the TechCom board.Jul 18 2018, 22:37

Was there a TechCom discussion? I didn't find anything in the IRC logs from June through September, 2018. If there's no dissent beyond the historical notes in this task, it seems uncontroversial to go ahead with the change.

My only suggestion would be to leave the deprecated logic in place for now, but to remove the isConflict = false line and instead log the base revision ID, so we can correlate it with conflict EventLogging.

tstarling moved this task from Inbox to Backlog on the TechCom board.Jul 18 2018, 22:37

Was there a TechCom discussion? I didn't find anything in the IRC logs from June through September, 2018. If there's no dissent beyond the historical notes in this task, it seems uncontroversial to go ahead with the change.

This was not an RFC discussion, just quick internal review during board grooming (note that this is on the committee board, not the RFC board). I just dug the discussion up from the archives, the relevant statement was: "Not a purely technical decision, need input from product/UX perspective from a user point of view". So, blocked on a product decision.

My only suggestion would be to leave the deprecated logic in place for now, but to remove the isConflict = false line and instead log the base revision ID, so we can correlate it with conflict EventLogging.

Doing a potentially expensive database query just for the logging seems overkill. Who would look at these logs, and for what purpose?

Doing a potentially expensive database query just for the logging seems overkill. Who would look at these logs, and for what purpose?

Good points, I mostly wanted to satisfy my own curiosity about whether ignoring self-conflicts is a misfeature or not. What I would have done is, record when these happen, and take diffs showing what was silently reverted or added. This would prove or falsify the theoretical typo use case, or show that these self-conflicts have stopped altogether.

The operational overhead might not be an issue since the code has been in place since the beginning of our epoch, and it's on a lukewarm or stone cold, server-side code path. But I'm assuming you're more concerned about the tech debt, and I completely agree. Just throwing it out is an excellent first step! If anyone asks for a revert, that's the point at which we can add logging.

Krinkle moved this task from Backlog to Watching on the TechCom board.Mar 18 2020, 8:08 PM

For the record, this was discussed in the TechCom meeting two weeks ago. No concerns were raised, there are no longer any technical reasons to keep this behavior.

JTannerWMF added a subscriber: JTannerWMF.

It looks like there is no action for the Editing Team.