Create a language data library out of jquery.uls.data module and make it independent of jquery
Closed, Resolved (Public)

Description

The jquery.uls library provides jquery.uls.data, a collection of language-related data (script, direction, autonym, geographical information) and a set of utility methods to look up this data. It was originally developed as part of the ULS project and is now widely used across many projects in Wikimedia.

I think it does not need to be a jQuery-dependent library, and it should be usable with Node.js and other environments as well. The language selector can add it as a dependency.
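As a rough illustration of what a jQuery-free version of such lookups could look like, here is a minimal sketch. The entry layout `[script, regions, autonym]`, the region codes, the sample entries, and all function names are assumptions made for this example; they are not taken from the actual library.

```javascript
// Hypothetical sample data; each entry is assumed to be [script, regions, autonym].
const languages = {
  ar: ["Arab", ["ME", "AF"], "العربية"],
  en: ["Latn", ["EU", "AM"], "English"],
  he: ["Hebr", ["ME"], "עברית"],
  ml: ["Mlym", ["AS"], "മലയാളം"],
};

// Illustrative subset of right-to-left scripts, not an exhaustive list.
const rtlScripts = new Set(["Arab", "Hebr", "Syrc", "Thaa", "Nkoo"]);

// Return the raw entry for a language code, or undefined if it is unknown.
function lookup(code) {
  return Object.prototype.hasOwnProperty.call(languages, code)
    ? languages[code]
    : undefined;
}

function getScript(code) {
  const entry = lookup(code);
  return entry ? entry[0] : undefined;
}

function getRegions(code) {
  const entry = lookup(code);
  return entry ? entry[1] : [];
}

function getAutonym(code) {
  const entry = lookup(code);
  return entry ? entry[2] : undefined;
}

// Direction is derived from the script; unknown codes default to "ltr".
function getDir(code) {
  return rtlScripts.has(getScript(code)) ? "rtl" : "ltr";
}

// Plain CommonJS export when available, so no jQuery object is needed.
if (typeof module !== "undefined" && module.exports) {
  module.exports = { getScript, getRegions, getAutonym, getDir };
}
```

Because nothing here touches the DOM or the `$` object, the same file could in principle be loaded from Node.js or from a browser bundle, and the language selector would simply depend on it.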

Event Timeline

No, jQuery is indeed not a requirement. Node.js compatibility is more important. (Although I thought that jQuery is usable with Node.js. But again, it's not very important.)

It would also be exciting to have this replace Names.php in MediaWiki core, and at least some of the Language.php methods, some day.

Sounds like some of this may be redundant with things like cldr.js, Globalize and other initiatives. Perhaps we could instead evaluate the possibility of improving and adopting those?

That is one possibility. We have not done any analysis on overlap of features recently.

Globalize includes cldr.js.

Of course, interpretations of the purpose of the langdb differ, but my interpretation is that it is only a lightweight database with very wide coverage of languages, including the minimal data needed to identify and select a language: native name, direction, script and location (for grouping). As far as I know, no other library aims to do this. The same data can be obtained from CLDR, but its coverage is not as wide.

I would agree with Krinkle if our aim was to have general purpose JavaScript i18n library for MediaWiki.

The same data can be obtained from CLDR, but its coverage is not as wide.

Are you saying CLDR lacks certain data types we need (e.g. it has no field for location), or that it supports it but the data isn't filled in for some languages, or that it supports fewer languages?

  • CLDR 24: 238 languages
  • ULS: ~488
  • Ethnologue: ~7000

We do use CLDR data for geolocation suggestions. I am not aware of anyone else doing grouping by continent and script type in the way we do (just applying the raw data blindly could produce weird results).
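To make the idea of grouping by region and script type concrete, here is a small sketch. It is not the actual ULS grouping logic; the script group names, region codes, sample entries, and function names are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical sample data; each entry is assumed to be [script, regions].
const sampleLanguages = {
  ar: ["Arab", ["ME", "AF"]],
  en: ["Latn", ["EU", "AM"]],
  hi: ["Deva", ["AS"]],
  ml: ["Mlym", ["AS"]],
  fa: ["Arab", ["ME", "AS"]],
};

// Hypothetical script groups: several scripts are shown under one heading.
const scriptGroups = {
  Latin: ["Latn"],
  Arabic: ["Arab"],
  SouthAsian: ["Deva", "Mlym"],
};

// Map a script code to its group name, falling back to "Other".
function groupOfScript(script) {
  for (const [group, scripts] of Object.entries(scriptGroups)) {
    if (scripts.includes(script)) {
      return group;
    }
  }
  return "Other";
}

// Build { region: { scriptGroup: [language codes] } } from the raw data.
function groupByRegionAndScript(data) {
  const result = {};
  for (const [code, [script, regions]] of Object.entries(data)) {
    const group = groupOfScript(script);
    for (const region of regions) {
      if (!result[region]) {
        result[region] = {};
      }
      if (!result[region][group]) {
        result[region][group] = [];
      }
      result[region][group].push(code);
    }
  }
  return result;
}

const grouped = groupByRegionAndScript(sampleLanguages);
console.log(JSON.stringify(grouped.AS));
```

With this sample data, the `AS` region ends up with two script groups, `SouthAsian` (`hi`, `ml`) and `Arabic` (`fa`). A real selector would need curated overrides on top of this, since applying raw data blindly can give odd groupings.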

So there is some overlap with other projects in the basic data: language direction and language autonyms, which are the parts of this dataset most used outside of ULS.

https://github.com/wikimedia/language-data now provides the language data that was initially bundled in jquery.uls. We required this for ContentTranslation-CXserver.

jquery.uls will start using this library. See https://github.com/wikimedia/jquery.uls/pull/271

santhosh claimed this task.