
Sitelink tables corrupted after merge of items
Closed, Resolved (Public)

Description

There seems to be a problem with sitelinks after certain item merges at Wikidata. Scenario: item A with sitelink "foo" is merged into item B with sitelink "bar". The corrupted post-merge situation is as follows:

  • "foo" and "bar" appear in the merged item (Web UI, also JSON representation and so on)
  • "foo" does *not* appear in WDQS as a sitelink of any item, but "bar" does
  • "foo" does *not* appear in Wikidata's wb_items_per_site table, but "bar" does
  • "foo" does *not* appear in the local page_props table
  • page "foo" appears locally unconnected, and it can indeed be connected to an item other than the merge target
  • page "bar" does see "foo" locally as an interwiki link

The sitelink data for such items seems to be corrupted: the item itself and the secondary stores disagree. This has also been reported, with specific cases, at Wikidata:Contact the development team.
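The symptoms listed above amount to a set difference: sitelinks present in the item's JSON but absent from a secondary store. A minimal Python sketch of that comparison follows. It assumes the rows from each secondary store (wb_items_per_site, WDQS, and so on) have already been fetched as (site id, page title) pairs. The function names, the item ID "Q_B" and the site IDs are hypothetical, chosen only for illustration, and the sketch does not query any live service.

```python
from typing import Dict, Iterable, List, Set, Tuple

SiteLink = Tuple[str, str]  # (site id, page title)


def entity_sitelinks(entity: dict) -> Set[SiteLink]:
    """Extract (site, title) pairs from the 'sitelinks' map of an item's JSON."""
    return {
        (link["site"], link["title"])
        for link in entity.get("sitelinks", {}).values()
    }


def find_missing(
    entity: dict, stores: Dict[str, Iterable[SiteLink]]
) -> Dict[str, List[SiteLink]]:
    """For each secondary store, list the item's sitelinks that the store lacks."""
    expected = entity_sitelinks(entity)
    return {
        name: sorted(expected - set(rows))
        for name, rows in stores.items()
    }


# Hypothetical merged item B: both sitelinks are visible in its JSON.
merged_item = {
    "id": "Q_B",  # placeholder, not a real item ID
    "sitelinks": {
        "commonswiki": {"site": "commonswiki", "title": "Category:Foo"},
        "enwiki": {"site": "enwiki", "title": "Bar"},
    },
}

# Hypothetical contents of the secondary stores after the faulty merge:
# only "Bar" made it across.
stores = {
    "wb_items_per_site": [("enwiki", "Bar")],
    "WDQS": [("enwiki", "Bar")],
}

missing = find_missing(merged_item, stores)
print(missing)
```

An empty list for every store would mean the item and its secondary stores agree. Any non-empty list identifies sitelinks in the state described in this task.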

investigate for 4h

Event Timeline

I can currently usually find live examples of this by looking through the English Wikipedia for Commons category links that are missing from Wikidata, so if you need live examples when debugging this, let me know and I can provide some.

I'll have a look around for some current examples, but I haven't seen any for the last week, so this may well have been solved along with the deadlocks issue. However, T233520 (page_props missing links) is definitely still happening, and may be related to clearing out any old cases here as well.

Mike_Peel claimed this task.

I haven't seen any live examples of this recently, so I'm marking this as resolved.