
The portal page generator should not silently ignore existing wikis with no l10n file
Closed, Resolved (Public)

Description

If there are stats for a language but no l10n file, the language should not be silently ignored, as it is now (rWPOR data/stats.js:84-86 (at 8ce0bdb7b3c406d5360e6004ea1c781507f963be)). Options:

  • Throw an error; don’t let the portal be built unless all necessary l10n data is available. Unblocking the build is as easy as creating the relevant l10n file with the content {"language-name":"<language name>"}; the language name can be found in various places, e.g. in the project list on Meta (https://meta.wikimedia.org/wiki/List_of_Wikipedias).
  • Show only a warning. This doesn’t break the developer workflow, but it’s easier to overlook.
  • Handle the case when there’s no translation, e.g. by using the language code instead of the language name. Ugly, but still better than not showing the wiki at all.
  • Remove the language-name message, leaving no required messages (and don’t skip the language if there’s no l10n file). There are zillions of places that contain native language names, so why do we force people to manually maintain an n+1st copy for the portal pages on translatewiki.net? language-name-romanized and language-name-romanized-sorted are a bit trickier, but maybe those could be automated as well.
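
The first three options can be compared side by side in a minimal sketch. Everything here is hypothetical: the function name, the mode names and the shape of the l10n object are assumptions made for illustration, not code from data/stats.js.

```javascript
// Hypothetical helper; none of these names come from the portals code base.
// `l10n` is the parsed l10n file for the language, or undefined if it is missing.
function resolveLanguageName(code, l10n, mode) {
  const name = l10n && l10n['language-name'];
  if (name) {
    return name;
  }
  if (mode === 'strict') {
    // Option 1: refuse to build without the l10n data.
    throw new Error(`Missing l10n file or language-name for "${code}"`);
  }
  if (mode === 'warn') {
    // Option 2: keep building, but report that the wiki is being skipped.
    console.warn(`Skipping "${code}": no l10n file or language-name`);
    return null;
  }
  // Option 3: show the wiki anyway, labelled with its language code.
  return code;
}
```

With a missing file, 'strict' stops the build, 'warn' returns null so the caller can skip the wiki, and any other mode returns the language code itself.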

Incident caused by this silent skipping: T319137

Event Timeline

Jdrewniak triaged this task as High priority.

@Tacsipacsi thanks for digging into this issue. I think having the build fail without the translation file would be the most straightforward solution for now. While I understand that maintaining the language-name seems burdensome, pulling these values in from a different data source will be non-trivial. The ULS extension, which has a similar requirement (having to show the names of all available languages) pulls in language names from the PHP version of the CLDR database and has a workflow for parsing those values.

It might be a good idea to take that same approach for the portals too. In the past there have been many mistranslations of the language names. Specifically, when someone translates the language name from English, they often end up writing the word "English" in their language instead of the name of their language.

Change 983555 had a related patch set uploaded (by Tacsipacsi; author: Tacsipacsi):

[wikimedia/portals@master] Fall back to @wikimedia/language-data for autonyms

https://gerrit.wikimedia.org/r/983555

> The ULS extension, which has a similar requirement (having to show the names of all available languages) pulls in language names from the PHP version of the CLDR database and has a workflow for parsing those values.

Actually, that workflow uses data from jquery.uls, which in turn is updated from wikimedia/language-data, which happens to have an NPM package, so I could just add that as a dependency. (It still needs to be updated from time to time, but language-data falls back to the language code as a last resort, so even without updating, the output is ugly, not missing; and updating can be done by any developer, it doesn’t require language knowledge.)

> It might be a good idea to take that same approach for the portals too. In the past there have been many mistranslations of the language names. Specifically, when someone translates the language name from English, they often end up writing the word "English" in their language instead of the name of their language.

I didn’t take this path due to the concerns about language-name-romanized and language-name-romanized-sorted, but I added log messages when the two data sources differ, which can highlight mistranslation and other inconsistencies.
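
A rough sketch of the behaviour described in this comment: the l10n value stays primary, a second data source fills in when it is missing, the language code is the last resort, and a message is logged when the two sources disagree. The function name, the log callback and the small in-memory map are hypothetical stand-ins. The map only imitates an autonym lookup; the @wikimedia/language-data package itself is not called here, and the merged patch may be structured differently.

```javascript
// Stand-in for an autonym lookup (assumption for illustration only).
const fallbackAutonyms = { hu: 'magyar', de: 'Deutsch' };

// Hypothetical helper: returns the name to display for a language code.
function pickAutonym(code, l10n, autonyms, log) {
  const translated = l10n ? l10n['language-name'] : undefined;
  const known = Object.prototype.hasOwnProperty.call(autonyms, code);
  // Last resort is the bare language code, so the wiki is never dropped.
  const fallback = known ? autonyms[code] : code;
  if (translated && known && translated !== fallback) {
    // The two sources disagree; this can point at a mistranslation.
    log(`language-name mismatch for ${code}: l10n has "${translated}", fallback has "${fallback}"`);
  }
  return translated || fallback;
}
```

For example, `pickAutonym('de', { 'language-name': 'Englisch' }, fallbackAutonyms, console.warn)` still returns the l10n value but logs the disagreement, which is the kind of mistranslation mentioned earlier in the thread.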

Change 983555 merged by jenkins-bot:

[wikimedia/portals@master] Fall back to @wikimedia/language-data for autonyms

https://gerrit.wikimedia.org/r/983555

Thank you for this patch @Tacsipacsi! I didn't realize that wikimedia/language-data was an NPM package; it certainly seems like the best solution for this use-case. I've verified this patch locally and it resolved some long-standing bugs with the portal page. I'm deploying the change today, Monday, Jan 8.

[Attached screenshots: wikipedia.org-prod.png, wikipedia.org-language-fix.png]
Tacsipacsi claimed this task.
Tacsipacsi added a subscriber: Jdrewniak.

@Jdrewniak You’re welcome, but did you see my comment from yesterday on the Gerrit patch? I’d like to either fix the syntax error or hear a firm “don’t bother” before closing this task.

I found T306995: Migrate node-based services in production to node14, which would make ?. (optional chaining) work. wikimedia/portals hasn’t been ticked off yet.
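
For context, ?. is JavaScript's optional chaining operator, which Node.js parses from version 14 onward; older versions reject it as a syntax error. A minimal sketch of the operator next to an equivalent form that older versions accept (both function names and the l10n object shape are hypothetical, not taken from the patch):

```javascript
// Needs Node.js 14 or later: ?. short-circuits to undefined
// when l10n is null or undefined.
function nameWithOptionalChaining(l10n) {
  return l10n?.['language-name'];
}

// Equivalent explicit check that also parses on older Node.js versions.
function nameWithoutOptionalChaining(l10n) {
  return l10n === null || l10n === undefined ? undefined : l10n['language-name'];
}
```

Both functions return the same value for every input, so the second form is a drop-in replacement wherever the runtime cannot be upgraded yet.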