Security review for the language-data library
Closed, Resolved; Public; 2 Estimated Story Points

Description

We're planning to integrate the language-data library into MediaWiki core.

What?

Quoting from the README file of the language-data library:

This library contains language related data, and utility libraries written in PHP and Node.js to interact with that data.

Here's a link to the PHP library: https://language-data.readthedocs.io/en/latest/index.html#using-the-php-library

Volunteers contribute language information to this YAML file: https://github.com/wikimedia/language-data/blob/master/data/langdb.yaml. A script is then run to generate a JSON file from it: https://github.com/wikimedia/language-data/blob/master/data/language-data.json. Here's a sample PR: https://github.com/wikimedia/language-data/pull/453

This data is then loaded into the PHP library via the following code:

$this->data = json_decode( file_get_contents( __DIR__ . '/' . self::LANGUAGE_DATA_PATH ) );

Similarly in JavaScript:

const languageData = require( '../data/language-data.json' );
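
Both loaders read the generated JSON file as-is, so the shape of that file is what the consuming code relies on. As an illustration only, here is a minimal sketch of a loader that parses the JSON text and checks its basic shape before use. It assumes the file has a top-level `languages` object keyed by language code; the helper name `parseLanguageData` and the sample payload are hypothetical and are not part of the library's API.

```javascript
'use strict';

// Hypothetical helper, not part of the language-data library.
// Assumption: the JSON has a top-level "languages" object keyed by language code.
function parseLanguageData( jsonText ) {
	let data;
	try {
		data = JSON.parse( jsonText );
	} catch ( e ) {
		throw new Error( 'language data is not valid JSON: ' + e.message );
	}

	// Reject null, primitives and arrays at the top level.
	if ( data === null || typeof data !== 'object' || Array.isArray( data ) ) {
		throw new Error( 'language data must be a JSON object' );
	}

	// Reject a missing or wrongly typed "languages" entry.
	const languages = data.languages;
	if ( languages === null || typeof languages !== 'object' || Array.isArray( languages ) ) {
		throw new Error( 'language data must contain a "languages" object' );
	}

	return data;
}

// Made-up sample payload, for illustration only.
const sample = '{"languages":{"en":["Latn",["EU"],"English"]}}';
const parsed = parseLanguageData( sample );
console.log( Object.keys( parsed.languages ) );
```

A check like this fails loudly on a truncated or malformed file instead of passing unexpected values on to callers; whether the library needs it is a separate question for the review.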

The library is maintained by the Language and Product Localization team at the Wikimedia Foundation.

A new version of the library is released every 6 months, but we will release more often once the library is integrated into MediaWiki.

The language-data library's PHP API currently uses mustangostang/spyc, but we plan to make this a dev dependency soon.

Related task: T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki

Why?

The library will support the language selector that will also be bundled with MediaWiki core, and its use will be extended to other features in the future. See: T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki

How?

We are planning to load the library into MediaWiki via composer.json.
For the front-end, we may use the JS library via foreign-resources.yaml.

The inclusion method is not yet confirmed.

When?

Preferably this quarter.

Event Timeline

abi_ updated the task description.
Nikerabbit changed the task status from Open to In Progress. Nov 17 2025, 8:16 AM
Nikerabbit moved this task from Backlog to In Progress on the LPL Essential (FY2025-26 Q2) board.

Got clarity for next steps:

This does need a security review. Please use this form to file a task requesting the security review, so that it's tagged with the right things and includes the right information.

Some changes and requests in the GitHub repo too:

Submitted: https://github.com/wikimedia/language-data/pull/465

Moving back to ready for dev due to lack of progress.

We've incorporated all the input from the security review and released a new version of the language-data library. See: https://github.com/wikimedia/language-data/releases/tag/1.1.10

The above issues are also addressed in the latest release.

Nikerabbit claimed this task.
Nikerabbit moved this task from Need QA to Done on the LPL Essential (FY2025-26 Q3&4) board.