[Regression] Wikipedia Portal: Sites Missing in the List
Closed, Resolved | Public | 2 Estimated Story Points | BUG REPORT

Description

This was raised several hours ago. The Wikipedia portal site (www.wikipedia.org) does not include zh (中文) in the list.

While zhwiki is not the only site facing this problem, it is the most noticeable one missing from the list. As the site with the 13th most articles, it should be in the 1,000,000+ list, but it isn't. According to meta:List of Wikipedias, zhwiki is the only site missing from that tier.

Also, glancing over the 100,000+ list, the number there does not match either.

I clearly remember that zhwiki was on that list last year. Is this a software bug, or a deliberate action?

Screenshot portal.png (899×1 px, 115 KB)

Event Timeline

This is a bug. The entry disappeared after a deployment yesterday, and I'm looking into it today.

Screenshot 2023-11-07 at 11.07.42 AM.png (2×2 px, 1 MB)
Screenshot 2023-11-07 at 11.07.49 AM.png (2×2 px, 1 MB)
before | after
Jdlrobson set the point value for this task to 2.

This issue was caused by this translation file being deleted last week: https://gerrit.wikimedia.org/r/c/wikimedia/portals/+/971151/1/l10n/zh.json
(Yes, I agree that such an action shouldn't cause a wiki to disappear.) In this case, I'm not sure whether we should:

  1. restore that translation file, or
  2. use a different Chinese variant instead, e.g. zh-hans.json.

This is what the translation looked like previously

Screenshot 2023-11-07 at 1.48.31 PM.png (288×740 px, 43 KB)

Change 972486 had a related patch set uploaded (by Jdrewniak; author: Jdrewniak):

[wikimedia/portals@master] Fix missing zh.wikipedia.org link

https://gerrit.wikimedia.org/r/972486

As the Chinese variants are unified in a single site, please consider using (restoring) zh.json for all zh-language sites.

@Winston_Sung This is probably caused by your deletions. I think translations should remain for zh for this project.

Umm... let me check.

They should all be restored on translatewiki now.

Change 972486 abandoned by Jdrewniak:

[wikimedia/portals@master] Fix missing zh.wikipedia.org link

Reason:

No longer necessary

https://gerrit.wikimedia.org/r/972486

@Winston_Sung
Other portal sites, such as Wiktionary (zh: 1,408,014) and Wikivoyage (zh: 5,491), have the same problem. Will this fix them as well?

It should be the same cause. The undeleted translations should bring them back.

Change 972723 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[wikimedia/portals@master] Partially revert "Localisation updates from https://translatewiki.net."

https://gerrit.wikimedia.org/r/972723

Change 972723 merged by jenkins-bot:

[wikimedia/portals@master] Partially revert "Localisation updates from https://translatewiki.net."

https://gerrit.wikimedia.org/r/972723

Looks like it's been fixed now for zhwiki, good job.

Should this ticket be closed as resolved, or left open for other potentially missing sites, as mentioned in the description?

I think it should remain open until the related issues are fixed.

For example, an unexpected deletion has happened again:

Change 973764 had a related patch set uploaded (by L10n-bot; author: L10n-bot):

[wikimedia/portals@master] Localisation updates from https://translatewiki.net.

https://gerrit.wikimedia.org/r/c/wikimedia/portals/+/973764

Jdlrobson subscribed.

Jan to provide screenshot to verify this is fixed and then move to sign off (and resolve)

Also, it seems that some languages with deprecated language codes are missing from the portal as well, for example lzh and nan.

@Winston_Sung that issue is captured in T319138, so I think this ticket can focus on the zh issue specifically, but I agree that it needs to be fixed. Unfortunately, that part of the code has been unmaintained for a long time and fixing it would be a large endeavour. I've had success maintaining this repo as a Google Summer of Code project in the past, so I think I'll try that again this year to hopefully deal with some of these long-standing issues.

Yeah, then I think it's okay to close this task once the zh issue is completely resolved, and to track that issue in T319138 (The portal page generator should not silently ignore existing wikis with no l10n file) instead.

But we still need to wait until this issue is resolved:

I think it should remain open until the related issues are fixed.

For example, an unexpected deletion has happened again:

Change 973764 had a related patch set uploaded (by L10n-bot; author: L10n-bot):

[wikimedia/portals@master] Localisation updates from https://translatewiki.net.

https://gerrit.wikimedia.org/r/c/wikimedia/portals/+/973764

Jdlrobson renamed this task from Wikipedia Portal: Sites Missing in the List to [Regression] Wikipedia Portal: Sites Missing in the List. (Nov 14 2023, 5:37 PM)
Winston_Sung changed the task status from Open to In Progress. (Nov 16 2023, 5:14 PM)

Change 974975 had a related patch set uploaded (by L10n-bot; author: L10n-bot):

[wikimedia/portals@master] Localisation updates from https://translatewiki.net.

https://gerrit.wikimedia.org/r/c/wikimedia/portals/+/974975

Looks like the file is no longer being deleted, for now.


Jan to provide screenshot to verify this is fixed and then move to sign off (and resolve)

@Nikerabbit @Jdrewniak @Winston_Sung

I believe zh.json was last modified by me in T171647, so I can probably provide some helpful info here. I may not remember all the details, though.

The portal's l10n files are used in two different ways. Some parts are used as wiki names/descriptions in the circular area and are statically embedded into the HTML ("usage 1"). The other parts are used for interface localization performed by JavaScript according to the browser language ("usage 2").

For zhwiki (and other wikis with LanguageConverter), the logic is more complex. We have zh.json plus additional data in l10n-overrides.json, which are for usage 1, and zh-hans.json plus zh-hant.json for usage 2. There may be corresponding fields for usage 1 in these two files, but I don't think they are used.

Both zh.json and l10n-overrides.json are required because the zh locale is ambiguous (Simplified or Traditional? We can't tell), and it's better to apply fully localized names/descriptions according to the browser language. The content is applied conditionally: if the browser language is zh-hans, zhwiki's name and description are replaced with the Simplified Chinese content in l10n-overrides.json; if it is zh-hant, these fields are replaced with the Traditional Chinese content; lastly, as a fallback, these fields use the zh.json content, which is in the format "Simplified / Traditional", just as T350670#9314022 shows.
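The conditional logic described above can be sketched roughly as follows. This is a minimal illustration in Python, not the portal's actual code: the helper `resolve_zh_name`, the dictionary shapes, and the sample strings are all assumptions made for the example, and the real zh.json and l10n-overrides.json layouts may differ.

```python
from typing import Optional

# Illustrative stand-in for zh.json: one combined "Simplified / Traditional" string.
ZH_FALLBACK = {"name": "维基百科 / 維基百科"}

# Illustrative stand-in for the zh entries in l10n-overrides.json.
ZH_OVERRIDES = {
    "zh-hans": {"name": "维基百科"},
    "zh-hant": {"name": "維基百科"},
}


def resolve_zh_name(browser_lang: Optional[str]) -> str:
    """Pick the zhwiki display name for a browser language (hypothetical helper)."""
    lang = (browser_lang or "").lower()
    override = ZH_OVERRIDES.get(lang)
    if override is not None:
        # zh-hans or zh-hant: use the fully localized override.
        return override["name"]
    # Any other language: fall back to the combined zh.json string.
    return ZH_FALLBACK["name"]
```

Under these assumptions, removing the zh.json stand-in leaves nothing to fall back on for browsers that are neither zh-hans nor zh-hant, which matches the point that zh.json has to stay.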

To conclude, zh.json contains the necessary metadata (usage 1) for zhwiki and should coexist with zh-hans.json and zh-hant.json, which are only for usage 2.

Obviously this is not a good design. Given the fact that portal's code is largely unmaintained, I believe it needs a complete overhaul. That's out of this task's scope, though.
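Until such an overhaul happens, one possible guard against this kind of regression is a check that reports wikis with no l10n file instead of silently dropping them, in the spirit of T319138. The sketch below is only an illustration in Python: `find_missing_l10n` is a hypothetical helper, the `<code>.json` naming is assumed from the l10n/zh.json path linked earlier in this task, and the lists of files and codes are made up.

```python
from typing import Iterable, List


def find_missing_l10n(l10n_files: Iterable[str], wiki_codes: Iterable[str]) -> List[str]:
    """Return the wiki codes that have no matching <code>.json file (hypothetical helper)."""
    present = {name[: -len(".json")] for name in l10n_files if name.endswith(".json")}
    return sorted(code for code in wiki_codes if code not in present)


# Example: zh.json was deleted while the variant files remained.
missing = find_missing_l10n(["en.json", "zh-hans.json", "zh-hant.json"], ["en", "zh"])
if missing:
    print("wikis with no l10n file:", ", ".join(missing))
```

In this made-up example the variant files zh-hans.json and zh-hant.json do not count as a substitute for zh.json, so zh is reported as missing.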