
Clean up languages/ directory in MediaWiki core (June 2019)
Open, Needs Triage, Public

Description

The split of code between includes/ and languages/ can be confusing at times, and as part of T166010 it is expected to change regardless.

This task is to start thinking about that specifically for the languages/ directory:

  • How do we want to organise it going forward?
  • What (if anything) external to MW would need to be updated? Because these classes have been in this place for a long time, various (automated) workflows may have come to depend on their precise location.
  • What do we want to do with files that cannot be autoloaded?

Easy ones

The following are presumably easy candidates to move over without much complication:

  • LanguageCode
  • MessageLocalizer
  • ConverterRule
  • LanguageConverter
  • Language
  • FakeConverter

These could go to includes/languages/, alongside the Message class which is already on the "includes" side.

Data files

The data files (those that do not declare functions or classes, but rather return data or export local variables) could go to includes/languages/data, similar to what various other subdirectories already do (such as includes/collation/).

Language sub classes

These could move over now as well, or we could keep them where they are for proximity with the messages/ and i18n/ files. They are a nicely contained unit and don't really stand in the way.

PHP Message files

These are neither clean data files nor autoloaded classes. They are files that declare local variables in a to-be-included local scope. This is pretty hacky, and it might be time to convert them to simple static array files that return these values under array keys. That should also make these files much easier to read and process.
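
The difference between the two styles could be sketched roughly as follows. This is only an illustration, not MediaWiki's actual loader: the temporary files, their contents, and the helper functions loadLegacyMessages() and loadArrayMessages() are all hypothetical.

```php
<?php
// Hypothetical sketch: contrasts a file that declares local variables
// with a file that returns a static array. Not MediaWiki's real loader.

// Legacy style: the file declares variables in whatever scope includes it.
$legacyFile = tempnam( sys_get_temp_dir(), 'msg' );
file_put_contents( $legacyFile, "<?php\n\$fallback = 'en';\n\$rtl = false;\n" );

// Proposed style: the file returns a plain array keyed by name.
$arrayFile = tempnam( sys_get_temp_dir(), 'msg' );
file_put_contents( $arrayFile, "<?php\nreturn [ 'fallback' => 'en', 'rtl' => false ];\n" );

/**
 * Include the legacy file inside a function so that its variables land in
 * this function's local scope, then collect the ones the caller asked for.
 * The caller has to know the variable names in advance.
 */
function loadLegacyMessages( string $path, array $keys ): array {
	include $path;
	return compact( $keys );
}

/**
 * With a static array file, the include expression itself yields the data.
 */
function loadArrayMessages( string $path ): array {
	return require $path;
}

$legacy = loadLegacyMessages( $legacyFile, [ 'fallback', 'rtl' ] );
$static = loadArrayMessages( $arrayFile );

// Both loaders yield the same data; only the file format differs.
var_dump( $legacy === $static );

unlink( $legacyFile );
unlink( $arrayFile );
```

In the legacy style, the included file could also overwrite the loader's own local variables (here $path and $keys), which is part of what makes the approach fragile.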

Misc files

  • CrhExceptions

This is organised like a data file, but is actually a class with methods. Unsure where it should go.

  • Names.php and ZhConversion.php

These are autoloaded class files, but they are stored in the data directory and use "data" as an actual part of their class name hierarchy as well. (For other components in "includes/", a data directory usually holds files that are not in the class hierarchy but are related to the files nearby.) Having the class hierarchy overlap with a "data/" subdirectory might be confusing. In particular, after the PSR-4 transformation this would currently result in two directories whose names differ only by case (includes/Language/Data and includes/Language/data). That's probably something we want to avoid.

Maybe we'll rename all "data" subdirectories that contain non-class files to "Data" to accommodate this issue. Or maybe this could be renamed to something that doesn't have "Data" in its namespace path (with a back-compat alias). Or something else?
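
For the rename option, a back-compat alias could look roughly like the sketch below. The namespaces Example\Language\Tables and Example\Language\Data, and the get() method, are made up for illustration and are not the names used in MediaWiki core; class_alias() is the standard PHP function.

```php
<?php
// Hypothetical sketch of a rename with a back-compat alias.
// All namespaces, class names and methods here are illustrative assumptions.

namespace Example\Language\Tables {
	// New location: "Data" no longer appears in the namespace path.
	class Names {
		public static function get(): array {
			return [ 'en' => 'English' ];
		}
	}
}

namespace {
	// Keep the old fully-qualified name working for existing callers.
	class_alias(
		\Example\Language\Tables\Names::class,
		'Example\\Language\\Data\\Names'
	);

	// Old and new names now resolve to the same class.
	var_dump(
		\Example\Language\Data\Names::get() === \Example\Language\Tables\Names::get()
	);
}
```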

Event Timeline

Krinkle created this task. Jun 13 2019, 9:01 PM
Restricted Application added a subscriber: Aklapper. Jun 13 2019, 9:01 PM

Change 516955 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] [WIP] language: Move some language-related classes to includes/language/

https://gerrit.wikimedia.org/r/516955

Change 516955 merged by jenkins-bot:
[mediawiki/core@master] language: Move some language-related classes to includes/language/

https://gerrit.wikimedia.org/r/516955

Krinkle updated the task description.
Krinkle updated the task description.

I would be reluctant to move the data files under includes/. The fact that some of them are currently PHP files is just an implementation detail in my opinion.

Also, T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki purports to drop some of that data in favor of using a more comprehensive data library.

> I would be reluctant to move the data files under includes/. The fact that some of them are currently PHP files is just an implementation detail in my opinion.

I agree. The reason I propose moving them is not related to them using PHP syntax. includes/ is where our source files reside, including any data files that a component may need to read. In MediaWiki core this covers files with PHP classes, PHP functions, and PHP constants, as well as any files read by PHP files: data in cdb, json, or php format, shell scripts, MIME lists, ICC profiles, Firejail profiles, and Mustache templates used by PHP. None of these kinds of files have ever been stored elsewhere afaik, with the exception of maintenance/ and languages/.

cscott added a subscriber: cscott. Jul 11 2019, 7:28 PM

Change 523009 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] tests: Move unit/languages to unit/includes/language

https://gerrit.wikimedia.org/r/523009

Change 523009 merged by jenkins-bot:
[mediawiki/core@master] tests: Move unit/languages to unit/includes/language

https://gerrit.wikimedia.org/r/523009

Change 532258 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] language: Move ConverterRule to includes/ and add unit test

https://gerrit.wikimedia.org/r/532258

Change 532258 merged by jenkins-bot:
[mediawiki/core@master] language: Move ConverterRule to includes/ and add test case

https://gerrit.wikimedia.org/r/532258