Sitelink tables corrupted after merge of items
Open, High · Public

Description

There seems to be a problem with sitelinks after certain item merges at Wikidata. Scenario: item A with sitelink "foo" is merged into item B with sitelink "bar". The corrupted post-merge situation is as follows:

  • "foo" and "bar" appear in the merged item (Web UI, also JSON representation and so on)
  • "foo" does *not* appear in WDQS as a sitelink of any item, but "bar" does
  • "foo" does *not* appear in Wikidata's wb_item_per_site table, but "bar" does
  • "foo" does *not* appear in the local page_props table
  • page "foo" appears locally unconnected, and it can indeed be connected to an item other than the merge target
  • page "bar" does see "foo" locally as an interwiki link

The sitelink state appears to be corrupted. This has also been reported, with specific cases, at Wikidata:Contact the development team.
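
The mismatch described in the list above (sitelink present in the item JSON, absent from wb_item_per_site) could be checked mechanically. The following is a minimal sketch, not the way Wikibase itself does it: the function name `missing_sitelink_rows` is hypothetical, the item ID and site IDs in the sample data are made up for illustration, and the table rows are assumed to have already been fetched as (ips_site_id, ips_site_page) pairs with titles normalised the same way as in the JSON.

```python
from typing import Iterable, List, Tuple


def missing_sitelink_rows(
    entity: dict, item_per_site_rows: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return (site_id, title) pairs that are in the entity JSON
    but have no matching row in the secondary table."""
    # Entity JSON keeps sitelinks as {site_id: {"site": ..., "title": ...}}.
    in_json = {
        (site_id, link["title"])
        for site_id, link in entity.get("sitelinks", {}).items()
    }
    # Rows are assumed to be (ips_site_id, ips_site_page) for this item only.
    in_table = set(item_per_site_rows)
    return sorted(in_json - in_table)


# Hypothetical data mirroring the scenario: after the merge, the target
# item lists both "foo" and "bar", but only "bar" made it into the table.
merged_item = {
    "id": "Q2",  # made-up ID for item B
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "bar"},
        "commonswiki": {"site": "commonswiki", "title": "foo"},
    },
}
rows = [("enwiki", "bar")]

print(missing_sitelink_rows(merged_item, rows))
```

An empty result would mean the JSON and the table agree for that item; any pair returned is a candidate for the corruption described here.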

investigate for 4h

Event Timeline

Restricted Application added a subscriber: Aklapper. · Wed, Nov 6, 7:41 AM

I can usually find live examples of this at the moment by looking through the English Wikipedia for Commons category links that are missing from Wikidata. If you need live examples when debugging this, let me know and I can provide some.

Lydia_Pintscher triaged this task as High priority. · Sat, Nov 9, 1:48 AM
Lydia_Pintscher moved this task from Incoming to Ready to estimate on the Wikidata-Campsite board.

Thank you!

alaa_wmde updated the task description. · Tue, Nov 12, 1:43 PM

I'll have a look around for some current examples, but I haven't seen any for the last week, so this may well have been solved with the deadlocks issue. That said, T233520 (page_props missing links) is definitely still happening, and may be related to clearing out any old cases here as well.