
[20hr] ‎Investigate Wikibase behavior after the site ID of the related site (via sitelink) is changed
Closed, Resolved · Public

Description

Wikibase interacts with different wikis ("sites") - identified by the "site ID".
For a number of reasons, the site ID could be changed (e.g. be_x_oldwiki, a WMF Wikipedia, is intended to be referred to as be_taraskwiki).
It is not clear how such a change affects Wikibase. In the past, certain Wikibase functionality has not worked correctly after a change, e.g.

  • langlinks API: T112426
  • special:setsitelink does not recognize the new site ID as a valid input

Analyze and document the current behavior of Wikibase related to sitelinks/sites when a site is renamed (i.e. its site ID has been changed). This will serve as a basis for specifying what needs fixing.

Note: a past investigation, which linked to a potential ocean of other problematic cases, was T112647.

Acceptance criteria:

  • Overview of all known behavior related to sitelinks/sites in Wikibase
  • Document where the list of available/known sites currently comes from

Timebox: 20h 18h

Event Timeline

Restricted Application added a subscriber: Aklapper.

The Wikidata data model allows breaking changes without touching existing data, as the internal representation of entities is not a stable interface. Changing a site ID does not actually require any change to existing entities in the database; we only need to map deprecated language codes to the newer ones and stop using the deprecated ones for new edits.
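To make the mapping idea concrete, here is a minimal sketch in Python. It is not Wikibase code: the alias table and the helper names (`SITE_ID_ALIASES`, `canonical_site_id`, `read_sitelink`, `write_sitelink`) are hypothetical, and it maps site IDs rather than language codes. It only illustrates "stored data stays as it is, reads accept both IDs, new edits use the new ID".

```python
# Hypothetical alias table: deprecated site ID -> current site ID.
SITE_ID_ALIASES = {"be_x_oldwiki": "be_taraskwiki"}


def canonical_site_id(site_id):
    """Return the current site ID for a possibly deprecated one."""
    return SITE_ID_ALIASES.get(site_id, site_id)


def read_sitelink(sitelinks, site_id):
    """Find a page title, whichever of the two IDs was stored or requested."""
    wanted = canonical_site_id(site_id)
    for stored_id, title in sitelinks.items():
        if canonical_site_id(stored_id) == wanted:
            return title
    return None


def write_sitelink(sitelinks, site_id, title):
    """Return a copy in which the link is stored under the current ID only."""
    canonical = canonical_site_id(site_id)
    # Drop any entry (old or new ID) that refers to the same site.
    updated = {
        stored_id: stored_title
        for stored_id, stored_title in sitelinks.items()
        if canonical_site_id(stored_id) != canonical
    }
    updated[canonical] = title
    return updated


# Existing data still uses the deprecated ID and is not rewritten.
stored = {"be_x_oldwiki": "Example page", "enwiki": "Minsk"}
print(read_sitelink(stored, "be_taraskwiki"))
print(write_sitelink(stored, "be_x_oldwiki", "Renamed page"))
```

With this shape, a deprecated ID only disappears from an entity when that sitelink is next edited, so no bulk migration of stored entities is needed.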

It was noted that, for documenting where the list of available/known sites currently comes from, it might be useful to be able to query the Wikidata production databases.

I think places to check would include, but not be limited to:

  • Special pages
    • SetSitelink
    • ItemByTitle
    • GoToLinkedPage
    • NewItem (it has URL parameters for creating an item that’s initially linked to a certain page)
  • APIs
    • Wikibase APIs to edit data
    • Wikibase APIs to get data
    • site+title parameters (can replace id to specify an item in various APIs)
    • langlinks, see T112426
  • Wikibase Repo sitelink editing UI
  • Wikibase Client sitelink UI? (I’ll admit I barely know this one)
  • Lua
    • getting sitelinks of an entity
      • possibly getting the language of a sitelink, similar to the langlinks API?
    • looking up an entity by title
noarave renamed this task from Investigate Wikibase behavior after the site ID of the related site (via sitelink) is changed to [20hr] Investigate Wikibase behavior after the site ID of the related site (via sitelink) is changed. Dec 2 2020, 1:34 PM

Note from task inspection for possible inputs to check:

  • be_x_oldwiki
  • be_taraskwiki
  • be-x-old
  • be-tarask

Other notes that may be helpful:

  • It should be possible to test this on test.wikidata.org
  • DB tables that may or may not be related are sites and interwiki
  • These local settings might be related:
    • $wgWBRepoSettings['localClientDatabases']
    • $wgLocalDatabases

I put my investigation results at mw:User:Lucas Werkmeister (WMDE)/site ID investigation. In summary:

  • The site selection UI (when editing sitelinks on the item or when adding one on a client page) uses be_x_oldwiki but also accepts be_tarask.wikipedia.org and be-tarask.wikipedia.org.
  • The sitematrix API has a language be-tarask but with no wikis in it; the language be-x-old has the wiki be_x_oldwiki, whose URL is https://be-tarask.wikipedia.org/.
  • The interwikimap API has prefixes be-tarask and be-x-old, both of which point to https://be-tarask.wikipedia.org/.
  • The wbgetentities API and the langlinks API correctly use the be-tarask.wikipedia.org domain when generating URLs for sitelinks to that wiki.
  • Practically everything else uses be_x_oldwiki as the wiki ID or be-x-old as the language code, and nothing else. This includes the special pages SetSitelink, ItemByTitle, GoToLinkedPage, NewItem, and the Wikibase APIs and Lua interface.
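For anyone who wants to re-check the sitematrix and interwikimap findings, the sketch below builds the two request URLs and summarises a sitematrix-style response. The `action=sitematrix` and `meta=siteinfo` with `siprop=interwikimap` parameters are standard MediaWiki API modules. The helper names and the `SAMPLE` response are illustrative assumptions: the sample is hand-written to mirror the findings above and is not captured API output, so the real response shape may differ in detail.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def api_url(**params):
    """Build a JSON API request URL from keyword parameters."""
    return API + "?" + urlencode({**params, "format": "json"})


SITEMATRIX_URL = api_url(action="sitematrix")
INTERWIKIMAP_URL = api_url(action="query", meta="siteinfo", siprop="interwikimap")


def wikis_by_language(response):
    """Map each language code to the (dbname, url) pairs listed under it."""
    result = {}
    for key, group in response["sitematrix"].items():
        if not key.isdigit():
            continue  # skip non-language keys such as "count" and "specials"
        result[group["code"]] = [
            (site["dbname"], site["url"]) for site in group.get("site", [])
        ]
    return result


# Hand-written sample (assumed shape), mirroring the findings above.
SAMPLE = {
    "sitematrix": {
        "count": 2,
        "0": {"code": "be-tarask", "site": []},
        "1": {
            "code": "be-x-old",
            "site": [
                {
                    "dbname": "be_x_oldwiki",
                    "url": "https://be-tarask.wikipedia.org/",
                }
            ],
        },
    }
}

print(SITEMATRIX_URL)
print(INTERWIKIMAP_URL)
print(wikis_by_language(SAMPLE))
```

Fetching `SITEMATRIX_URL` and passing the decoded JSON to `wikis_by_language` should show whether be-tarask still has no wikis listed under it.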

Moving to peer review; if I missed anything, feel free to edit the wiki page (that’s why I put the results there). I’d say there’s 18h left in the timebox.

The investigation results match the acceptance criteria but I will leave this in review for a little time.