Project Information
- Name of tool/project: language-data
- Project home page: https://github.com/wikimedia/language-data/
- Name of team requesting review: Language and Product Localization team (LPL)
- Primary contact: @abi_ , @Nikerabbit
- Target date for deployment: Early Q3 2026 / Jan 31, 2026
- Link to code repository / patchset: https://github.com/wikimedia/language-data
- Link to scc output for general sizing of codebases (https://github.com/boyter/scc):
$ scc /home/abijeet/Projects/Wikimedia/language-data/ --exclude-dir=docs,data
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
JSON                         3       127        1         0      126          0
PHP                          3       782      112       223      447         31
YAML                         3        69       11         4       54          0
JavaScript                   2       519       42       164      313         50
Markdown                     2       500       75         0      425          0
XML                          2        28        0         0       28          0
License                      1       339       58         0      281          0
───────────────────────────────────────────────────────────────────────────────
Total                       16     2,364      299       391    1,674         81
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $46,402
Estimated Schedule Effort (organic) 4.28 months
Estimated People Required (organic) 0.96
───────────────────────────────────────────────────────────────────────────────
Processed 92149 bytes, 0.092 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
Description of the tool/project:
Quoting from the README file of the language-data library:
This library contains language related data, and utility libraries written in PHP and Node.js to interact with that data.
Here's a link to the PHP library: https://language-data.readthedocs.io/en/latest/index.html#using-the-php-library
Volunteers contribute language information to this YAML file: https://github.com/wikimedia/language-data/blob/master/data/langdb.yaml. A script is then run to generate a JSON file from it: https://github.com/wikimedia/language-data/blob/master/data/language-data.json. Here's a sample PR: https://github.com/wikimedia/language-data/pull/453
This data is then loaded into the PHP library via the following code:
$this->data = json_decode( file_get_contents( __DIR__ . '/' . self::LANGUAGE_DATA_PATH ) );
Similarly in JavaScript:
const languageData = require( '../data/language-data.json' );
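To illustrate how the loaded JSON might be consumed, here is a minimal sketch in JavaScript. It is not the library's actual API: the helper names (resolve, getAutonym, isRtl) are hypothetical, and the entry layout ([script, regions, autonym], with single-element arrays acting as redirects, plus an rtlscripts list) is an assumption about the shape of language-data.json. A small inline sample stands in for the real file so the snippet runs on its own.

```javascript
// Illustrative sample only. The real data/language-data.json is much larger,
// and the entry layout assumed here is [ script, regions, autonym ].
const sampleData = {
	languages: {
		en: [ 'Latn', [ 'EU', 'AM', 'AF', 'ME', 'AS', 'PA', 'WW' ], 'English' ],
		he: [ 'Hebr', [ 'ME' ], 'עברית' ],
		iw: [ 'he' ] // assumed redirect entry: a single-element array naming the target code
	},
	rtlscripts: [ 'Arab', 'Hebr' ]
};

// Hypothetical helper: follow redirect entries until a full entry is found.
// Returns null for unknown codes and for redirect cycles.
function resolve( data, code ) {
	const seen = new Set();
	let entry = data.languages[ code ];
	while ( entry && entry.length === 1 && !seen.has( code ) ) {
		seen.add( code );
		code = entry[ 0 ];
		entry = data.languages[ code ];
	}
	return entry && entry.length > 1 ? entry : null;
}

// Hypothetical helper: the language's name in its own language, or null.
function getAutonym( data, code ) {
	const entry = resolve( data, code );
	return entry ? entry[ 2 ] : null;
}

// Hypothetical helper: true when the language's script is listed as right-to-left.
function isRtl( data, code ) {
	const entry = resolve( data, code );
	return entry !== null && data.rtlscripts.includes( entry[ 0 ] );
}

console.log( getAutonym( sampleData, 'iw' ), isRtl( sampleData, 'iw' ) );
```

Passing the data object into each helper, instead of reading a module-level variable, keeps the sketch testable with sample data; the real library loads the generated JSON file once, as shown in the lines above.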
The library is maintained by the Language and Product Localization team at the Wikimedia Foundation.
A new version of the library is released every 6 months, but we will release more often once the library is integrated into MediaWiki.
The language-data library's PHP API currently uses mustangostang/spyc, but we plan to make this a dev-only dependency soon.
Related task: T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki
Description of how the tool will be used at WMF:
The library will support the language selector that will also be bundled with MediaWiki core, and its use will be extended to other features in the future. See: T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki
Dependencies
This library has no runtime dependencies, but it has the following dev dependencies:
npm
- eslint 8.57.0
- eslint-config-wikimedia 0.31.0
- mocha 10.6.0
composer
- ext-curl
- phpunit/phpunit 9.6.20
- mediawiki/mediawiki-codesniffer 47.0.0
- mustangostang/spyc 0.6.3
Has this project been reviewed before?
No, it hasn't.
Working test environment
- Using the PHP Library: https://language-data.readthedocs.io/en/latest/index.html#using-the-php-library
- Using the Node.js Library: https://language-data.readthedocs.io/en/latest/index.html#using-the-node-js-library
Post-deployment
The Language and Product Localization team (LPL) will continue to maintain the library.
Contacts: