ContentTranslation link adaptation is mostly broken when translating to be-tarask
Open, Medium, Public

Description

Publishing to be-tarask was broken because of the renaming of the domain, but now it works.

However, now there is another odd bug:

  • Translate "Мацэ" from be to be-tarask
  • Click the first paragraph

It has a link to "Italy" (Італіі). When you click the link, two correct link cards are shown, with Italy in be and be-tarask. However, the link looks gray (unadapted) in the translation column, and when you publish the article, it is not there.

Could be a Wikidata bug (not sure, didn't debug carefully).

Event Timeline

Amire80 raised the priority of this task from to Medium.
Amire80 updated the task description. (Show Details)
Amire80 subscribed.

After debugging CX a bit, I see that the langlinks API query doesn't return anything, so I opened the more general bug: T112426.
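
For anyone who wants to reproduce the empty result by hand, here is a rough sketch of a langlinks request built against the standard MediaWiki query API. The exact request CX sends is not shown in this task, so the parameter set, the helper name `langlinks_query_url`, and the article title used here are assumptions for illustration only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def langlinks_query_url(source_domain, title, target_lang):
    """Build a MediaWiki API URL asking for the language link of one page.

    Hypothetical helper: CX may use different parameters in practice.
    """
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": target_lang,  # restrict the result to one target language
        "redirects": 1,
        "format": "json",
    }
    return "https://" + source_domain + "/w/api.php?" + urlencode(params)


# Assumed title of the "Italy" article on the be wiki.
url = langlinks_query_url("be.wikipedia.org", "Італія", "be-tarask")
query = parse_qs(urlsplit(url).query)
print(url)
```

If the response to such a request has no `langlinks` entry for the page, that would match the behaviour described above.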

I assume the problem arises from deriving the internal wiki id (which still is be_x_oldwiki) from the subdomain (which now is be-tarask). The relationship between subdomain and internal ID (read: database name) isn't guaranteed, and already fails for a handful of special cases.

Bene investigated this briefly, and dug up a configuration setting for ContentTranslation that handles such cases: https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FContentTranslation/master/extension.json currently has:

 "ContentTranslationDomainCodeMapping": {
    "bho": "bh",
    "crh-latn": "crh",
    "gsw": "als",
    "lzh": "zh-classical",
    "nan": "zh-min-nan",
    "nb": "no",
    "rup": "roa-rup",
    "sgs": "bat-smg",
    "vro": "fiu-vro",
    "yue": "zh-yue"
}

This appears to be a mapping between language codes and subdomains. A similar mapping from (sub)domains to internal wiki IDs could be added to handle cases like be-tarask having the internal id be_x_oldwiki. That would be a quick fix.
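
A minimal sketch of what that quick fix could look like, assuming a second override table layered on top of the existing one. `DOMAIN_TO_WIKI_ID` and the helper functions are hypothetical names, not existing ContentTranslation settings, and the fallback rule (hyphens become underscores, then append "wiki") is only the usual Wikipedia convention, which is exactly the part that is not guaranteed.

```python
# Subset of the existing language code -> subdomain mapping quoted above.
LANGUAGE_TO_DOMAIN = {
    "gsw": "als",
    "nan": "zh-min-nan",
    "nb": "no",
}

# Hypothetical second mapping: subdomain -> internal wiki ID (database name).
DOMAIN_TO_WIKI_ID = {
    "be-tarask": "be_x_oldwiki",
}


def domain_for_language(code):
    """Return the subdomain for a language code, defaulting to the code itself."""
    return LANGUAGE_TO_DOMAIN.get(code, code)


def wiki_id_for_domain(domain):
    """Return the wiki ID for a subdomain, using the override table first."""
    if domain in DOMAIN_TO_WIKI_ID:
        return DOMAIN_TO_WIKI_ID[domain]
    # Assumed default convention; it fails for renamed domains like be-tarask.
    return domain.replace("-", "_") + "wiki"


def wiki_id_for_language(code):
    """Chain both lookups: language code -> subdomain -> wiki ID."""
    return wiki_id_for_domain(domain_for_language(code))


print(wiki_id_for_language("be-tarask"))
print(wiki_id_for_language("nan"))
```

Without the override entry, the fallback would produce be_taraskwiki for be-tarask, which is not the database name.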

However, such mappings are always a nasty workaround. Core should offer an API for mapping between domains and internal IDs; see T112909 (I suppose one could use action=sitematrix, but that's not very nice). Wikibase could also allow domain names to be used instead of internal wiki IDs in API parameters; see T112910.
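
To illustrate the action=sitematrix route mentioned above, here is a rough sketch that turns a sitematrix-style response into a domain to database name lookup. The `SAMPLE` data is hand-written for this example and only approximates the shape of a real response (numbered language groups with a `site` list, plus a `specials` list); it was not copied from the live API, and `domain_to_dbname` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def domain_to_dbname(sitematrix):
    """Build a {hostname: dbname} dict from a sitematrix-shaped structure."""
    mapping = {}
    for key, group in sitematrix.items():
        if key == "count":
            continue  # plain integer, not a group of sites
        # "specials" is a flat list of sites; other keys hold a "site" list.
        sites = group if key == "specials" else group.get("site", [])
        for site in sites:
            host = urlsplit(site["url"]).hostname
            mapping[host] = site["dbname"]
    return mapping


# Hand-written sample, assumed to resemble the "sitematrix" part of a response.
SAMPLE = {
    "count": 3,
    "0": {"site": [{"url": "https://be.wikipedia.org", "dbname": "bewiki"}]},
    "1": {"site": [{"url": "https://be-tarask.wikipedia.org",
                    "dbname": "be_x_oldwiki"}]},
    "specials": [{"url": "https://commons.wikimedia.org",
                  "dbname": "commonswiki"}],
}

lookup = domain_to_dbname(SAMPLE)
print(lookup["be-tarask.wikipedia.org"])
```

This needs a full extra request and client-side parsing just to resolve one ID, which is part of why it is not a very nice substitute for a dedicated API.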

sitematrix has its own horrible issue here: T111876

Change 270579 had a related patch set uploaded (by Amire80):
Add mapping from be-tarask to be-x-old

https://gerrit.wikimedia.org/r/270579

I patched this temporarily, but it must be fixed upstream in Wikibase. It's important not only for getting this fully fixed, but also for other future domain renames (T21986).

Do you have a suggestion how to fix that in Wikibase? How would Wikibase know which domain refers to which internal wiki ID (i.e. database name)? Core would have to somehow provide this information. In PHP, core provides WikiMap, which offers a (hackish) way to do the opposite, namely getting the domain for a given wiki ID. But that doesn't help much.

Change 270579 merged by jenkins-bot:
Add mapping from be-tarask to be-x-old

https://gerrit.wikimedia.org/r/270579

Sorry, I am not very familiar with the Wikibase code. I guess it's the same solution as for T112647, although that might be a circular argument...

(Verified in Content Translation in production, but please don't close the bug until the underlying Wikidata issue is resolved.)