[Bug] Merging doesn't always create a redirect
Open, Low, Public

Description

When checking items which have no statements or sitelinks, I keep coming across items which were merged but not redirected, leaving behind an empty item. I'm not sure how the users are creating the edits, but it seems like unintended behaviour.

Some examples from this month:

https://www.wikidata.org/w/index.php?title=Q9919933&action=history
https://www.wikidata.org/w/index.php?title=Q10055090&action=history
https://www.wikidata.org/w/index.php?title=Q9956781&action=history
https://www.wikidata.org/w/index.php?title=Q9964632&action=history
https://www.wikidata.org/w/index.php?title=Q9924553&action=history
https://www.wikidata.org/w/index.php?title=Q9826553&action=history

It doesn't appear to be a new problem, though: I'm also finding items where the merge took place months ago, e.g.

https://www.wikidata.org/w/index.php?title=Q9918772&action=history (November)
https://www.wikidata.org/w/index.php?title=Q12296651&action=history (October)
https://www.wikidata.org/w/index.php?title=Q12149749&action=history (August)

Looking at those examples, it seems like they all still have some descriptions set, even though most of the information was removed.
Expected behaviour: The redirects are successfully created, like when using the merge gadget.
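A minimal sketch of how such leftovers could be spotted, assuming the item is available as the usual Wikibase entity JSON with `claims`, `sitelinks` and `descriptions` keys. The function name and the sample item (with the placeholder ID `Q0`) are made up for illustration.

```python
def looks_like_partial_merge(entity: dict) -> bool:
    """Return True for an item with no statements or sitelinks but leftover descriptions."""
    has_claims = bool(entity.get("claims"))
    has_sitelinks = bool(entity.get("sitelinks"))
    has_descriptions = bool(entity.get("descriptions"))
    return not has_claims and not has_sitelinks and has_descriptions


# Hypothetical sample data, not a real item.
leftover = {
    "id": "Q0",
    "labels": {},
    "descriptions": {"en": {"language": "en", "value": "example description"}},
    "claims": {},
    "sitelinks": {},
}

print(looks_like_partial_merge(leftover))
```

This only flags candidates; whether an item really is the remains of a merge still has to be confirmed from its history.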


Additional information added by Esc3300:
Sitelink conflicts: Part of the merge is also performed when there is a sitelink conflict and a description conflict:

(QuickStatements merges were then done via the API.) Expected behavior: items shouldn't be edited at all.

Event Timeline

Nikki raised the priority of this task from to Needs Triage.
Nikki updated the task description.
Nikki added a project: Wikidata.
Nikki subscribed.

When they use Special:MergeItems no redirect is created when there is a conflict with descriptions. Rest of the merge is performed. Don't know if the user gets an error message. The merge gadget has a workaround to make sure a redirect is created.

Lydia_Pintscher renamed this task from Merging doesn't always create a redirect to [Bug] Merging doesn't always create a redirect. Feb 18 2016, 12:04 PM
Lydia_Pintscher triaged this task as Medium priority.
Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.

I'm not sure how the users are creating the edits, but it seems like unintended behaviour.

Probably with the client sitelink widget.

When they use Special:MergeItems no redirect is created when there is a conflict with descriptions. Rest of the merge is performed. Don't know if the user gets an error message. The merge gadget has a workaround to make sure a redirect is created.

Do you know this, or think? I usually get the error message and no merge is initiated.

AFAIK, the sitelink widget blocks the merge and reports an error on a statement and/or sitelink conflict, but does neither on a description conflict, which causes these merges not to complete.

Esc3300 added subscribers: Magnus, Esc3300.

There seems to be a problem when there are sitelink conflicts and description conflicts: items get partially merged.

(Some discussion about this: https://www.wikidata.org/wiki/Topic:Tblnumb80t1noodd ) I had used @Magnus's QuickStatements.

@Esc3300: I'm not sure that it's a good idea to combine your case with the problem I originally reported where the merges should have been completed, not prevented. They have different expected behaviour and I think your case should be a higher priority, which I assume would need a separate ticket.

For now I've edited the description to clarify that I'm not asking for all of the merges I originally listed to be prevented. (Phabricator doesn't clearly show that other people have edited the description, let alone added whole new paragraphs...)

I edited the summary as @Lydia_Pintscher said she prefers to have it there.

I can do a separate ticket if this is preferred.

It might be worth listing the various tools, their options and outcome (actual and expected).

| Tool | Options | Outcome: no conflict | Outcome: description conflict | Outcome: sitelink conflict | Notes | OK/NOK |
| --- | --- | --- | --- | --- | --- | --- |
| Special:MergeItems | none | merge generally completed/sometimes fails | fails - not edited | stops - not edited | | ok/nok/ok |
| Gadget | none | merge completed | merge completed | stops - item not edited | | ok |
| QuickStatements | none | merge completed? | fails - item edited, not redirected | fails - item edited, not redirected | uses api | ok/nok/nok |
| Api | description, sitelinks, statements | | | | | |

https://www.wikidata.org/w/api.php?action=help&modules=wbmergeitems

So what should I set "ignore" to?
Phrased differently: With which settings will wbmergeitems fail and not edit the item, in case of non-resolvable conflict?

I'd use "description" or "description|statements".

The statements option is probably there to prevent items that link to each other from being merged.

To be safe, use "description".
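For reference, the suggestion above would correspond to request parameters roughly like the following. This is only a sketch: the parameter names (`fromid`, `toid`, `ignoreconflicts`, `token`) are taken from my reading of the wbmergeitems help page linked above and should be checked there, as should the exact spelling of the accepted `ignoreconflicts` values. The helper name, the item IDs and the token are placeholders.

```python
def merge_params(from_id: str, to_id: str, token: str, ignore=("description",)) -> dict:
    """Build the POST parameters for a wbmergeitems call (no request is sent)."""
    params = {
        "action": "wbmergeitems",
        "format": "json",
        "fromid": from_id,
        "toid": to_id,
        "token": token,
    }
    if ignore:
        # Multiple values are pipe-separated, as is usual for the MediaWiki API.
        params["ignoreconflicts"] = "|".join(ignore)
    return params


# Placeholder IDs and token, for illustration only.
example = merge_params("Q0", "Q00", "TOKEN")
print(example["ignoreconflicts"])
```

Passing an empty `ignore` leaves the parameter out entirely, so no conflict type is ignored.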

Obviously, there is still the bug that the API sometimes doesn't work, but that is what this ticket was originally about. PLbot generally fixes these afterwards.

Now switched to "ignoreconflicts:description".

I wonder if this fails because:

  • the check for the conflict doesn't always work correctly, but the merge still continues
  • the option(s) chosen for "ignoreconflicts" are forgotten somewhere during the process

When this is being fixed, maybe the feature T141845 could be added as well.

This is desired behavior and applies equally to ignoreconflicts=description and ignoreconflicts=sitelink (and combinations of both).

Basically if either or both of these are set, we will just leave the conflicting piece of data (description or sitelink) on the Item we're merging from. As non-empty Items can't be turned into redirects, we can't automatically do that in the next steps (without having to "throw away" the data).
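To make the described behaviour concrete, here is a toy model in Python. It is not the Wikibase code (that lives in ChangeOpsMerge, in PHP); the function names are invented, descriptions are reduced to a plain language-to-text mapping, and dropping identical descriptions from the source is my assumption.

```python
def merge_descriptions(source: dict, target: dict) -> None:
    """Move descriptions from source to target, leaving conflicting ones on source."""
    for lang in list(source):
        if lang not in target:
            # No description in this language on the target yet: move it over.
            target[lang] = source.pop(lang)
        elif target[lang] == source[lang]:
            # Same text on both items: nothing to keep on the source (assumption).
            del source[lang]
        # Otherwise the texts differ: the conflicting description stays on source.


def can_redirect(source: dict) -> bool:
    """A non-empty item can't be turned into a redirect."""
    return not source


source = {"en": "a city", "de": "Stadt"}
target = {"en": "a town"}
merge_descriptions(source, target)
print(source, target, can_redirect(source))
```

In this model the conflicting English description is what keeps the source item non-empty, which is why the redirect step cannot follow automatically.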

If we want to change this behavior, we need to decide what we want to do with the conflicting data in such cases.

I think it shouldn't allow a merge unless it is also going to create a redirect - either the merge is good and both parts should be done, or the merge is bad and neither should be done. Doing one but not the other always (as far as I can tell) creates a situation where someone needs to fix it afterwards.

The way the merge gadget behaves is pretty intuitive to me. That blanks any remaining descriptions before creating the redirect. There is also a bot (PLbot?) which attempts to find and fix partial merges which does the same thing. If someone is going to do a merge that ignores conflicting descriptions, removing any conflicting descriptions so that a redirect can be created would be the best option in my opinion and also the most consistent.

I can't really comment on ignoring sitelink conflicts because I would expect any merge with sitelink conflicts to be prevented. The first paragraph I wrote though would suggest that if people/things are going to be allowed to do merges which explicitly ignore sitelink conflicts, the conflicting sitelinks should be removed so that the item can be redirected.

I don't think it should abandon the merger half way through. I wonder where it was determined that this would be desired behavior.

I don't think it should abandon the merger half way through.

It doesn't, it fully completes the merge, but doesn't create the redirect.

I wonder where it was determined that this would be desired behavior.

I don't know when this was decided, but this behavior is set explicitly in the code (ChangeOpsMerge in Wikibase, in case you want to check).

I don't think it should abandon the merger half way through.

It doesn't, it fully completes the merge, but doesn't create the redirect.

Well, yes and no. There are two ways of seeing it. Currently Pasleim's bot just cleans up behind. Generally this is fine, but sometimes it isn't.

It seems that initially items were merged into one and then the other deleted. This might just be a leftover from that.

It seems that initially items were merged into one and then the other deleted. This might just be a leftover from that.

We never automatically deleted items, but users did that with scripts.

The current behaviour is exactly as it was designed: you can't technically merge something that has conflicts. If you try to merge two items that both have an enwiki sitelink, then you are probably doing something wrong and should resolve the conflict before merging.

Having the merge api automatically discard conflicts would mean you could merge Q1 with Q2 and nothing would complain.

We never automatically deleted items, but users did that with scripts.

Indeed, and if a user desires this, the sequence is:

  1. call wbmergeitems
  2. call wbeditentity to clear the item
  3. call wbcreateredirect
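
The three calls above could be chained roughly as follows. This is a sketch under assumptions, not a tested client: the parameter names are from my reading of the API help pages (`clear` and `data` for wbeditentity, `from` and `to` for wbcreateredirect), the `post` callable is a stand-in for whatever authenticated HTTP session is in use, and the function name is made up.

```python
from typing import Callable, Dict, List

Post = Callable[[Dict[str, str]], dict]


def merge_clear_redirect(post: Post, from_id: str, to_id: str, token: str) -> List[dict]:
    """Run wbmergeitems, wbeditentity (clear) and wbcreateredirect in order."""
    steps = [
        # 1. Merge, leaving conflicting descriptions on the source item.
        {"action": "wbmergeitems", "fromid": from_id, "toid": to_id,
         "ignoreconflicts": "description"},
        # 2. Empty the source item; this discards whatever was left on it.
        {"action": "wbeditentity", "id": from_id, "clear": "1", "data": "{}"},
        # 3. Turn the now-empty source item into a redirect.
        {"action": "wbcreateredirect", "from": from_id, "to": to_id},
    ]
    responses = []
    for step in steps:
        response = post({**step, "format": "json", "token": token})
        if "error" in response:
            # Stop at the first failure instead of continuing with later steps.
            raise RuntimeError(f"{step['action']} failed: {response['error']}")
        responses.append(response)
    return responses
```

Note that step 2 throws away the conflicting data, so this is only appropriate when the user has already decided the leftover descriptions are not worth keeping.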
Addshore lowered the priority of this task from Medium to Low. Oct 5 2016, 9:08 AM

I think you should add:

  4. undo semi-"merge"

I don't think it should abandon the merger half way through.

It doesn't, it fully completes the merge, but doesn't create the redirect.

I think this is a difference in what people mean by "merge": from a user's point of view, "merging" most likely refers to the whole process, so while from a technical point of view the "merging" step is successfully completed, from the user's point of view "merging" has been abandoned halfway through.

This is still happening frequently and has consequences on the project.

Some cases that I have recently found by chance:

  • On 6 October 2020 Wowo2008 partially merged Q1622041 into Q7157841. Apparently, the user didn't notice.
  • On 18 September 2020 Siam2019 partially merged Q9761807 into Q7006090. Apparently, the user didn't notice.
  • On 2 September 2020 ДолбоЯщер partially merged Q9891392 into Q9909061 (including a link to merge.js). Apparently, the user didn't notice.
  • On 19 August 2020 Philemonbaucis partially merged Q8619302 into Q30788505. Apparently, the user didn't notice.
  • On 9 July 2020 Gotogo partially merged Q4783136 into Q22426445. Apparently, the user didn't notice.