ContentTranslation link adaptation is mostly broken when translating to be-tarask
Open, NormalPublic

Description

Publishing to be-tarask was broken because of the renaming of the domain, but now it works.

However, now there is another odd bug:

  • Translate "Мацэ" from be to be-tarask
  • Click the first paragraph

It has a link to "Italy" (Італіі). When you click the link, two correct link cards are shown, with Italy in be and be-tarask. However, the link looks gray (unadapted) in the translation column, and when you publish the article, it is not there.

Could be a Wikidata bug (not sure, didn't debug carefully).

Amire80 created this task. Sep 11 2015, 4:27 PM
Amire80 updated the task description.
Amire80 raised the priority of this task to Normal.
Amire80 added a subscriber: Amire80.
Restricted Application added a subscriber: Aklapper. Sep 11 2015, 4:27 PM
Amire80 updated the task description. Sep 11 2015, 8:16 PM
Amire80 set Security to None.
Amire80 moved this task from Needs Triage to CX6 on the ContentTranslation board. Sep 12 2015, 1:10 PM

After debugging CX a bit, I see that the langlinks API query doesn't return anything, so I opened the more general bug: T112426.
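For reference, a langlinks lookup of the kind described above can be sketched as follows. This only builds the request URL and does not send it. The exact request CX issues is not shown in this task, so the function name and the choice of parameters are assumptions; `action=query`, `prop=langlinks`, `titles` and `lllang` are standard MediaWiki API parameters.

```python
from urllib.parse import urlencode


def build_langlinks_url(source_domain: str, title: str, target_language: str) -> str:
    """Build a MediaWiki API URL asking for one language link of a page.

    Hypothetical helper for illustration; not taken from the CX code base.
    """
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": target_language,  # restrict the result to this language code
        "format": "json",
    }
    return f"https://{source_domain}.wikipedia.org/w/api.php?{urlencode(params)}"


# "Італія" is assumed to be the title behind the "Італіі" link mentioned above.
url = build_langlinks_url("be", "Італія", "be-tarask")
print(url)
```

If the response to such a request contains no langlinks entry for the page, CX has nothing to adapt the link to, which would match the gray (unadapted) link described in the task.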

daniel added a subscriber: daniel. Edited Sep 17 2015, 4:16 PM

I assume the problem arises from deriving the internal wiki id (which still is be_x_oldwiki) from the subdomain (which now is be-tarask). The relationship between subdomain and internal ID (read: database name) isn't guaranteed, and already fails for a handful of special cases.

Bene investigated this briefly and dug up a configuration setting for ContentTranslation that handles such cases. https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FContentTranslation/master/extension.json currently has:

"ContentTranslationDomainCodeMapping": {
    "bho": "bh",
    "crh-latn": "crh",
    "gsw": "als",
    "lzh": "zh-classical",
    "nan": "zh-min-nan",
    "nb": "no",
    "rup": "roa-rup",
    "sgs": "bat-smg",
    "vro": "fiu-vro",
    "yue": "zh-yue"
}

This appears to be a mapping between language codes and subdomains. A similar mapping from (sub)domains to internal wiki IDs could be added to handle cases like be-tarask having the internal id be_x_oldwiki. That would be a quick fix.
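A minimal sketch of that quick fix, assuming a second override table next to the existing one. The first table is copied from the configuration quoted above. The second table, the function names, and the default rule (replace hyphens with underscores and append "wiki", Wikipedia only) are assumptions made for illustration, not the actual ContentTranslation code.

```python
# Language code -> subdomain, copied from ContentTranslationDomainCodeMapping above.
DOMAIN_CODE_MAPPING = {
    "bho": "bh",
    "crh-latn": "crh",
    "gsw": "als",
    "lzh": "zh-classical",
    "nan": "zh-min-nan",
    "nb": "no",
    "rup": "roa-rup",
    "sgs": "bat-smg",
    "vro": "fiu-vro",
    "yue": "zh-yue",
}

# Hypothetical table: subdomain -> internal wiki ID, listing only the exceptions.
DOMAIN_WIKI_ID_OVERRIDES = {
    "be-tarask": "be_x_oldwiki",
}


def domain_for_language(language_code: str) -> str:
    """Return the subdomain for a language code, defaulting to the code itself."""
    return DOMAIN_CODE_MAPPING.get(language_code, language_code)


def wiki_id_for_domain(domain: str) -> str:
    """Return the internal wiki ID, consulting the override table first."""
    override = DOMAIN_WIKI_ID_OVERRIDES.get(domain)
    if override is not None:
        return override
    # Assumed default rule for Wikipedias: hyphens become underscores, plus "wiki".
    return domain.replace("-", "_") + "wiki"


def wiki_id_for_language(language_code: str) -> str:
    """Chain both lookups: language code -> subdomain -> internal wiki ID."""
    return wiki_id_for_domain(domain_for_language(language_code))
```

Without the override entry, the default rule would turn be-tarask into be_taraskwiki, which does not exist; with it, the lookup yields be_x_oldwiki.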

However, such mappings are always a nasty workaround. Core should offer an API for mapping between domains and internal IDs, see T112909 (I suppose one could use action=sitematrix, but that's not very nice). And Wikibase could allow domain names to be used instead of internal wiki IDs in API parameters, see T112910.
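To illustrate the action=sitematrix route, here is a rough sketch that turns an already parsed sitematrix response into a hostname to database name table. It assumes the usual response shape (numbered language groups carrying a "site" list, plus a "specials" list, each site having "url" and "dbname"). The sample data is hand-written for this example, not a captured API response, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def domain_to_dbname(sitematrix: dict) -> dict:
    """Map each site's hostname to its database name (internal wiki ID)."""
    mapping = {}
    for key, group in sitematrix.items():
        if key == "count":
            continue  # a plain integer, not a group of sites
        # Language groups are dicts with a "site" list; "specials" is a list itself.
        sites = group if isinstance(group, list) else group.get("site", [])
        for site in sites:
            mapping[urlsplit(site["url"]).hostname] = site["dbname"]
    return mapping


# Hand-written sample in the assumed shape, for illustration only.
SAMPLE = {
    "count": 2,
    "0": {"site": [{"url": "https://be.wikipedia.org", "dbname": "bewiki"}]},
    "1": {"site": [{"url": "https://be-tarask.wikipedia.org", "dbname": "be_x_oldwiki"}]},
    "specials": [],
}

lookup = domain_to_dbname(SAMPLE)
print(lookup["be-tarask.wikipedia.org"])
```

This needs an extra API round trip and client-side parsing for every consumer, which is part of why a dedicated core API would be preferable.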

sitematrix has its own horrible issue here: T111876

Amire80 moved this task from CX6 to CX7 on the ContentTranslation board. Oct 1 2015, 5:50 PM
Amire80 moved this task from CX7 to Upstream on the ContentTranslation board. Oct 16 2015, 7:00 AM

Change 270579 had a related patch set uploaded (by Amire80):
Add mapping from be-tarask to be-x-old

https://gerrit.wikimedia.org/r/270579

I patched this temporarily, but it must be fixed upstream in Wikibase. It's important not only for getting this fully fixed, but also for other future domain renames (T21986).

Do you have a suggestion how to fix that in Wikibase? How would Wikibase know which domain refers to which internal wiki ID (i.e. database name)? Core would have to somehow provide this information. In PHP, core provides WikiMap, which offers a (hackish) way to do the opposite, namely getting the domain for a given wiki ID. But that doesn't help much.

Change 270579 merged by jenkins-bot:
Add mapping from be-tarask to be-x-old

https://gerrit.wikimedia.org/r/270579

Sorry, I am really not familiar with the Wikibase code that much. I guess that it's the same solution as for T112647, although it might be a circular argument...

(Verified in Content Translation in production, but please don't close the bug until the underlying Wikidata issue is resolved.)

Restricted Application added a subscriber: PokestarFan. Jul 25 2017, 3:28 AM