[Task] Implement SiteIdMapper service
Open, High, Public

Description

Wikibase needs a more flexible way to handle site ids. In particular:

For API input (wbaddsitelink, etc.), several aliases should be supported per wiki:

  • In addition to the global ID, at least the domain should be usable as a wiki ID.
  • It should be possible to define additional input aliases for use when wikis get renamed, as was recently the case for be-x-old -> be-tarask.
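
To illustrate the input side, here is a minimal sketch of a reverse index from alias to global site ID. The table contents, the function name buildAliasIndex, and the choice to reject conflicting aliases with an exception are all assumptions made for this example; nothing here is existing Wikibase code.

```php
<?php
// Hypothetical alias table: global site ID => accepted global aliases.
// The values are illustrative only; a renamed wiki keeps its old domain as an alias.
$globalAliases = [
    'enwiki'       => [ 'enwiki', 'en.wikipedia.org' ],
    'be_x_oldwiki' => [ 'be_x_oldwiki', 'be-x-old.wikipedia.org', 'be-tarask.wikipedia.org' ],
];

/**
 * Build a reverse index: alias => global site ID.
 * Assumption: an alias claimed by two different sites is a configuration
 * error, since global aliases are supposed to be globally unique.
 */
function buildAliasIndex( array $aliasesBySite ): array {
    $index = [];
    foreach ( $aliasesBySite as $siteId => $aliases ) {
        foreach ( $aliases as $alias ) {
            if ( isset( $index[$alias] ) && $index[$alias] !== $siteId ) {
                throw new InvalidArgumentException(
                    "Alias '$alias' is claimed by both '{$index[$alias]}' and '$siteId'"
                );
            }
            $index[$alias] = $siteId;
        }
    }
    return $index;
}

$index = buildAliasIndex( $globalAliases );
```

With such an index, an API module could accept either the old or the new domain of a renamed wiki and normalize both to the same global ID before doing anything else.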

For manual input in the UI, at least the above aliases should be supported.

  • In addition, per-group IDs/aliases should be supported (e.g. "en" means "enwiki" in the context of the "wikipedia" group).
  • These aliases should be provided to the UI by the SitesModule.

For output, two "labels" should be available:

  • a long, globally unique label, which would also work as input to the UI widget and API module. The full domain name of the wiki should do.
  • a per-group shorthand, which would also work as input to the UI widget. This would usually be the language code, e.g. "en" for en.wikipedia.org

To achieve the above, we need a service (or several services) that provide the following functions:

getGlobalAliases( $globalSiteId ): string[] // all globally unique aliases for $globalSiteId
getLocalAliases( $groupId, $globalSiteId ): string[] // all aliases unique within the given group (including the global ones)
getGlobalName( $globalSiteId ): string // the preferred name that is also a globally unique alias
getLocalName( $groupId, $globalSiteId ): string // the preferred name that is also an alias unique in the given group

getAllGlobalAliases(): string[][] // map siteId -> list of globally unique aliases
getAllLocalAliases( $groupId ): string[][] // map siteId -> list of all locally unique aliases for members of the given group

resolveAlias( $alias, $groupId = null ): string // return the global site ID for the given alias. Local aliases are supported if $groupId is given.
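
As a rough sketch of how these functions could fit together, the following expresses them as a PHP interface with a simple array-backed implementation. The class name InMemorySiteIdMapper and the layout of the input array are hypothetical. Further assumptions: the first entry of each alias list is treated as the preferred name, unknown IDs or aliases raise an OutOfBoundsException, and a site that is not a member of the requested group only exposes its global aliases.

```php
<?php
interface SiteIdMapper {
    public function getGlobalAliases( string $globalSiteId ): array;
    public function getLocalAliases( string $groupId, string $globalSiteId ): array;
    public function getGlobalName( string $globalSiteId ): string;
    public function getLocalName( string $groupId, string $globalSiteId ): string;
    public function getAllGlobalAliases(): array;
    public function getAllLocalAliases( string $groupId ): array;
    public function resolveAlias( string $alias, ?string $groupId = null ): string;
}

// Hypothetical implementation backed by a plain array of the form
// globalSiteId => [ 'group' => string, 'global' => string[], 'local' => string[] ].
final class InMemorySiteIdMapper implements SiteIdMapper {
    private $sites;

    public function __construct( array $sites ) {
        $this->sites = $sites;
    }

    private function site( string $id ): array {
        if ( !isset( $this->sites[$id] ) ) {
            throw new OutOfBoundsException( "Unknown site ID: $id" );
        }
        return $this->sites[$id];
    }

    public function getGlobalAliases( string $globalSiteId ): array {
        return $this->site( $globalSiteId )['global'];
    }

    public function getLocalAliases( string $groupId, string $globalSiteId ): array {
        $site = $this->site( $globalSiteId );
        // Local aliases only apply inside the site's own group.
        $local = $site['group'] === $groupId ? $site['local'] : [];
        return array_values( array_unique( array_merge( $local, $site['global'] ) ) );
    }

    public function getGlobalName( string $globalSiteId ): string {
        return $this->getGlobalAliases( $globalSiteId )[0];
    }

    public function getLocalName( string $groupId, string $globalSiteId ): string {
        return $this->getLocalAliases( $groupId, $globalSiteId )[0];
    }

    public function getAllGlobalAliases(): array {
        // array_map() with a single array preserves the site ID keys.
        return array_map( function ( array $site ) {
            return $site['global'];
        }, $this->sites );
    }

    public function getAllLocalAliases( string $groupId ): array {
        $result = [];
        foreach ( $this->sites as $id => $site ) {
            if ( $site['group'] === $groupId ) {
                $result[$id] = $this->getLocalAliases( $groupId, $id );
            }
        }
        return $result;
    }

    public function resolveAlias( string $alias, ?string $groupId = null ): string {
        foreach ( $this->sites as $id => $site ) {
            $candidates = $groupId === null
                ? $site['global']
                : $this->getLocalAliases( $groupId, $id );
            if ( in_array( $alias, $candidates, true ) ) {
                return $id;
            }
        }
        throw new OutOfBoundsException( "Unknown site alias: $alias" );
    }
}

// Illustrative data: "en" is only meaningful together with a group.
$mapper = new InMemorySiteIdMapper( [
    'enwiki' => [ 'group' => 'wikipedia', 'global' => [ 'en.wikipedia.org', 'enwiki' ], 'local' => [ 'en' ] ],
    'enwiktionary' => [ 'group' => 'wiktionary', 'global' => [ 'en.wiktionary.org', 'enwiktionary' ], 'local' => [ 'en' ] ],
] );
```

In this sketch, resolveAlias( 'en', 'wikipedia' ) and resolveAlias( 'en', 'wiktionary' ) resolve to different sites, while resolveAlias( 'en' ) without a group fails because "en" is not globally unique.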

These functions would probably be implemented on top of a SiteList. SiteList and Site may have to be extended to provide access to additional information. The schema of the sites table should be flexible enough to accommodate all we need. The information in the SiteList can be mapped as follows:

  • the global ID is used as the primary identifier, as well as a global alias (and thus also a local alias).
  • all "local ids" (navigation ids, interwiki ids) would also count as global ids. Note the different meaning of "local" in this context.
  • a site's domain name would act as a global id, as well as the "global label"
  • a site's subdomain would act as a local id, as well as the "local label" (alternatively, we could use the language code)
  • additional aliases can be stored as "extra data"
  • the site's global and local label can be overridden by "extra data"
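
The mapping above could look roughly like this when applied to a single site record. The record is a plain array standing in for a Site object. Its keys (globalId, domain, localIds, extraData), the extra-data keys (aliases, globalLabel, localLabel), the function name mapSiteToAliases, and the sample values are all hypothetical and chosen only for illustration.

```php
<?php
/**
 * Derive aliases and labels from one site record, following the list above.
 * Assumption: the subdomain is the first dot-separated part of the domain.
 */
function mapSiteToAliases( array $site ): array {
    $extra = $site['extraData'] ?? [];
    $subdomain = explode( '.', $site['domain'] )[0];

    // Global aliases: global ID, domain, "local ids", and extra aliases.
    $global = array_values( array_unique( array_merge(
        [ $site['globalId'], $site['domain'] ],
        $site['localIds'] ?? [],
        $extra['aliases'] ?? []
    ) ) );

    // Local aliases: the subdomain plus every global alias.
    $local = array_values( array_unique( array_merge( [ $subdomain ], $global ) ) );

    return [
        'globalAliases' => $global,
        'localAliases' => $local,
        // Labels default to domain / subdomain unless extra data overrides them.
        'globalLabel' => $extra['globalLabel'] ?? $site['domain'],
        'localLabel' => $extra['localLabel'] ?? $subdomain,
    ];
}

// Illustrative record for a renamed wiki; the values are not taken from a real sites table.
$mapped = mapSiteToAliases( [
    'globalId' => 'be_x_oldwiki',
    'domain' => 'be-tarask.wikipedia.org',
    'localIds' => [ 'be-x-old' ],
    'extraData' => [ 'aliases' => [ 'be-x-old.wikipedia.org' ] ],
] );
```

Here the old domain survives as an extra alias, while both labels follow the new domain because no override is set in the extra data.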

daniel created this task. Oct 6 2015, 5:00 PM
daniel set the priority of this task to Needs Triage.
daniel triaged this task as High priority. Feb 14 2016, 4:52 PM

This blocks wiki renaming, see T21986