The Great Namespaceization and Reorg
Open, Low, Public

Description

Objective

  1. MediaWiki should use namespaces in a consistent way with low maintenance overhead.
  2. Extension registration should be fast.

Problem statement

  • The use of namespaces in MediaWiki is inconsistent. Directory names are not necessarily related to namespace names.
  • The convention of listing the path of all classes in autoload.php has become unwieldy.
  • Extension registration is slow, due to the need to merge enormous class maps.
  • Class names in the top level use generic names which risk conflicting with the PHP core.

Proposal

I propose moving all MW core classes to the MediaWiki\ namespace, with class_alias() for backwards compatibility. Autoloading would be done using PSR-4, which maps namespaces to directory names.

To work around bugs in PHP, class_alias() calls will be inserted in the file scope at the end of the file defining the new class. The autoloader will have an array mapping each legacy class name to the path of the new file. This is how class_alias() is already used in MW core. Whether this legacy autoloader is integrated with the PSR-4 autoloader or is registered separately is an implementation detail.
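
As a rough illustration of the alias mechanism described above, here is a minimal sketch. The MediaWiki\Feed namespace, the temporary directory layout, the empty stand-in class and the $legacyClassMap variable are all assumptions made for the example, not the actual implementation.

```php
<?php
// Sketch only. The file layout, the MediaWiki\Feed namespace and the
// $legacyClassMap name are assumptions made for illustration.

// 1. Simulate the file that defines the class under its new name.
$baseDir = sys_get_temp_dir() . '/namespaceization_demo';
if ( !is_dir( "$baseDir/MediaWiki/Feed" ) ) {
    mkdir( "$baseDir/MediaWiki/Feed", 0777, true );
}
$newFile = "$baseDir/MediaWiki/Feed/AtomFeed.php";
file_put_contents( $newFile, <<<'PHP'
<?php
namespace MediaWiki\Feed;

// Empty stand-in for the real class.
class AtomFeed {
}

// File scope, after the class definition: keep the old global name working.
class_alias( AtomFeed::class, 'AtomFeed' );
PHP
);

// 2. Legacy autoloader: old class name => file that defines the new class.
$legacyClassMap = [ 'AtomFeed' => $newFile ];
spl_autoload_register( static function ( string $class ) use ( $legacyClassMap ): void {
    if ( isset( $legacyClassMap[$class] ) ) {
        require_once $legacyClassMap[$class];
    }
} );

// 3. Old callers keep using the global name and get the namespaced class.
$feed = new AtomFeed();
echo get_class( $feed ), "\n"; // prints MediaWiki\Feed\AtomFeed
```

Loading the file is what creates the alias, so the legacy map only needs to know which file to load for each old name.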

A notable difference between the current layout and the one described by PSR-4 is that, under PSR-4, directory names must match namespace names, including capitalisation. We already have a MediaWiki\Linker namespace, but the files would be moved from includes/linker to MediaWiki/Linker.
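
To make the directory-matching rule concrete, the following is a minimal sketch of PSR-4 style path resolution. The function name psr4Resolve and the prefix map are hypothetical, and a real autoloader would typically also prefer the longest matching prefix and check that the file exists.

```php
<?php
// Sketch only: psr4Resolve() and the prefix map below are hypothetical.

/**
 * Resolve a fully qualified class name to a relative file path, PSR-4 style.
 *
 * @param string $class Fully qualified class name, without a leading backslash
 * @param array $prefixes Namespace prefix (with trailing backslash) => base directory
 * @return string|null Path of the file expected to define the class, or null
 */
function psr4Resolve( string $class, array $prefixes ): ?string {
    foreach ( $prefixes as $prefix => $baseDir ) {
        if ( strncmp( $class, $prefix, strlen( $prefix ) ) === 0 ) {
            // The rest of the class name maps 1:1 onto directories and the file
            // name, so capitalisation must match on case-sensitive filesystems.
            $relative = substr( $class, strlen( $prefix ) );
            return rtrim( $baseDir, '/' ) . '/' . str_replace( '\\', '/', $relative ) . '.php';
        }
    }
    return null;
}

$prefixes = [ 'MediaWiki\\' => 'MediaWiki' ];
echo psr4Resolve( 'MediaWiki\\Linker\\LinkRenderer', $prefixes ), "\n";
// prints MediaWiki/Linker/LinkRenderer.php
```

With this mapping, a file left at includes/linker/LinkRenderer.php would never be found, which is why the directories have to move.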

Another difference is that there must only be one class per file.

MediaWiki has more directories than namespaces, and this makes sense given how directories are used in editors. Additional namespaces only really become essential when there is a risk of class name conflicts. To use PSR-4, we will have to trade off our desire to limit the number of namespaces, for usability, against keeping files easy to find. The plan I detail below generally leans towards introducing additional namespaces.

There is the question of plural names. We (inconsistently) use plurals to indicate that a directory contains multiple things of a certain type, e.g. each member of includes/ is an include, each member of includes/specials is a special. This makes more sense for directories than for namespaces, given the way each is accessed: the class name MediaWiki\Specials\Activeusers would be strange to see in calling code or an error message. The namespace hierarchy is more of an ontology than a collection. I propose to use singular namespace names where the members of the namespace are each a singular instance of the parent concept. So: specials -> Special, jobs -> Job, skins -> Skin. But perhaps jobqueue/utils -> JobQueue\Utils since each class also contains utils, not a single util. And perhaps services -> Services, since it has container classes, the members are not themselves services.

There is the question of whether to retain class name prefixes and suffixes. For example, we have 116 special pages, certainly enough to deserve a separate directory. But should we have MediaWiki\Special\SpecialActiveusers or MediaWiki\Special\Activeusers? I think redundant prefixes should be removed, except when the resulting name becomes very ambiguous. We should keep in mind that text editors often only display the base name.

Some directories do not make sense as modules or namespaces, and exist only out of a desire to reduce the number of PHP files in includes/. These should probably be moved into their respective parent directories. Specifically:

  • cache
  • clientpool
  • compat
  • debug
  • exception
  • json

But in the other direction, I would propose:

  • Sanitizer, MagicWord, MagicWordArray -> Parser\
  • ForkController -> Maintenance\
  • MessageCache -> Language\
  • Message, RawMessage -> Language\
  • cache/localisation -> Language\LocalisationCache
  • A directory for non-class code: WebStart.php, Setup.php, DefaultSettings.php, Defines.php, GlobalFunctions.php, OutputHandler.php, NoLocalSettings.php. Namespacing of global functions and constants might be easier with PHP 7 group use, but I'm not proposing that at the moment.
    • Also shell scripts, config files?
  • A namespace for the web app setup, request routing and response (Request?): MediaWiki, PathRouter, AjaxDispatcher, AjaxResponse, WebRequest, WebRequestUpload, FauxRequest, DerivativeRequest, WebResponse, OutputPage, HeaderCallback, exception/*
  • Watchlist: WatchedItem, WatchedItemQueryService, WatchedItemQueryServiceExtension
  • StubObject: StubObject, DeprecatedGlobal
  • A namespace for revision, page storage (ArticleStore?): Revision, RevisionList, MergeHistory, MovePage, HistoryBlob, LinkBatch, LinkCache
  • A namespace for links storage: BacklinkCache, LinksUpdate, LinksDeletionUpdate
  • UserCache -> User\UserCache
  • Feed: FeedItem, ChannelFeed, RSSFeed, AtomFeed, FeedUtils

The classes in libs and utils would be better placed under the Wikimedia\ namespace, corresponding to the Composer vendor namespace, reflecting our aspirations for them to be separate from MediaWiki.

Other special cases:

  • jobqueue/jobs -> JobQueue\Job
  • includes/libs/rdbms/defines.php: migrate to namespaced constants.
  • specials -> Special
  • specials/helpers -> Special\Helper
  • specials/pagers -> Special\Pager
  • languages/classes -> MediaWiki\Language
  • languages/data/ZhConversion -> MediaWiki\Languages\Data\ZhConversion
  • maintenance: we can map MediaWiki\Maintenance to this directory for now
  • tests:
    • Some test classes are currently in the namespace they cover, I don't think that works with PSR-4. They should be in MediaWiki\Test\PHPUnit instead.
    • Test classes in MediaWiki\Test\PHPUnit, parser test runner in MediaWiki\Test\Parser
    • Classes under libs/ will be under the Wikimedia namespace, so associated tests should also be in the Wikimedia namespace
    • Map namespace MediaWiki\Test to directory tests\MediaWiki

New directory names will match namespace names, except for maintenance and tests. So for example, the language class files will not stay in languages/.

Existing namespaced extensions mostly use the top level, which I think is fine as long as the name is a distinctive product name. For generic descriptive names, I would prefer MediaWiki\Extension over MediaWiki\Extensions.

The transition should be fully scripted so that it can be used to rebase open changesets in Gerrit.

Event Timeline

Change 372521 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/tools/namespaceizer@master] [WIP] Core alias list

https://gerrit.wikimedia.org/r/372521

I moved ChangesFeed from includes/changes to the Feed namespace because I think it was missorted. It doesn't call anything else in the changes directory, and nothing in the changes directory calls it. It's integrated with ChannelFeed from Feed.php.

Api\Query is an obvious one, it would have about 54 classes. It could be deprefixed.

Sounds good.

Then there are about the same number of uncomplicated ApiBase subclasses, which could be put in Api\Module or Api\Action.

Also good for the most part. Api\Action would be more consistent with Api\Query and Api\Format.

ApiMain (the entry point) and ApiPageSet wouldn't belong in there though.

At some point I should split ApiBase into Api\Base and Api\Action\Base, so things that are really action-specific aren't inherited by every module to confuse people. In the meantime, mapping ApiBase to Api\Action\Base (or Api\ActionBase if we don't put "base" in the sub-namespace) breaks semantics, but it would make the future change less difficult. I have no strong opinion on whether "base" should be in the sub-namespace or not.

We could have Api\Format, that would be 9 classes if you include the base class.

Also a good idea.

The remainder would be about 20 base classes and miscellany, which could probably go directly in the Api namespace?

Sure.

Joe added a subscriber: Joe. Aug 30 2017, 10:08 AM

Regarding the question of how extensions should use namespaces:

Is MediaWiki\ExtensionName disallowed?

I'm open to discussion on that. Ironically one of the first instances of MediaWiki\ExtensionName was introduced by me, for ParserMigration. We also have Babel, Linter and UnCaptcha using that scheme. Under MediaWiki\Extensions we have Auth_remoteuser, EmailAuth, ExternalArticles, Genealogy, OAuth, OAuthAuthentication and PageViewInfo -- I'd like to discourage this plural namespace name going forward.

Has there already been an official decision? Because change rEBOP8c136be2d1eed004961fd1baa0eb3517b805192b suggests that MediaWiki\Extension\ExtensionName is now the convention. Can anybody confirm?
This topic is pretty important to me, because the BlueSpice team is switching to namespaces right now, and I want to make sure we stick as close to MediaWiki core patterns as possible. At the moment we use BlueSpice\ExtensionName as our pattern.

I think BlueSpice\ExtensionName is fine. As I said above, I think it's fine to use a unique, distinctive product name as the top-level namespace, that's the most common pattern at the moment. It's only a problem if the extension name is generic and descriptive, like "Collection".

(Note that there are even some more special page helpers in core, but they have been living in the most random of places – https://gerrit.wikimedia.org/r/377652 cleans that up a little bit.)

Tgr added a comment. Sep 12 2017, 10:00 PM

re: extension namespaces: PSR-4 says that "The fully qualified class name MUST have a top-level namespace name, also known as a “vendor namespace”." It doesn't define what a vendor namespace is, but the common-sense interpretation is that it must refer to an organization or group (or major open-source project, which are de facto groups). So IMO BlueSpice is fine, MediaWiki is OK (Wikimedia would be more vendorish but then the relationship between Wikimedia and MediaWiki is complicated), something like Collection or Flow is problematic as those are not vendors by any stretch of imagination (and there is a good chance other libraries use that namespace as well).

Seb35 added a subscriber: Seb35. Sep 17 2017, 1:38 PM

For information, in the case of MediaWiki\ExtensionName (versus MediaWiki\Extension\ExtensionName and MediaWiki\Skin\SkinName), there are at least 3 conflicts where an extension and a skin have the same name: Athena, CustomPage, Vector.

Seb35 added a comment. Sep 17 2017, 1:43 PM

The standard PSR-4 requires a map (namespace => directory). If each extension declares its map, there is no real constraint and BlueSpice (and SMW and others) can use a specific vendor without much difficulty. But if in the future most extensions are given a namespace MediaWiki\ExtensionName it could become tempting to define an implicit autoloader (MediaWiki\MyExtension\MyClass => /extensions/MyExtension/includes/MyClass.php). If it is the case, there could be these two parameters in extension.json:

  • the subdirectory where the classes are stored ("includes" in my example), possibly with a default value;
  • a specific vendor with "MediaWiki" or "MediaWiki\Extension" as default value.

I have no idea if this "implicit autoloader" would be a good thing, but I mention it now to envision a possible path and avoid a breaking change in the future (define the parameters now instead of doing maintenance later).

Below would be (a part of) the extension.json for BlueSpiceFoundation (to take an example of a specific vendor):

{
    "name": "BlueSpiceFoundation",
    "AutoloadDirectory": "includes",
    "AutoloadVendor": "BlueSpice"
}
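
A minimal sketch of how such an implicit autoloader could derive its PSR-4 map from these two parameters follows. The keys AutoloadDirectory and AutoloadVendor are only the ones suggested in this comment, not existing extension.json fields. The function name, the "MediaWiki" default vendor, and the choice to append the extension name to the vendor are likewise assumptions.

```php
<?php
// Sketch only: implicitPsr4Map() is hypothetical, and "AutoloadDirectory" and
// "AutoloadVendor" are the parameters suggested in this comment.

/**
 * Build a PSR-4 style map (namespace prefix => directory) for one extension.
 *
 * @param array $info Decoded extension.json contents
 * @param string $extensionsDir Directory that contains all extensions
 * @return array
 */
function implicitPsr4Map( array $info, string $extensionsDir ): array {
    // Assumed defaults: vendor "MediaWiki" and subdirectory "includes".
    $vendor = trim( $info['AutoloadVendor'] ?? 'MediaWiki', '\\' );
    $subDir = trim( $info['AutoloadDirectory'] ?? 'includes', '/' );
    $name = $info['name'];

    return [ "$vendor\\$name\\" => "$extensionsDir/$name/$subDir/" ];
}

$json = '{ "name": "BlueSpiceFoundation", "AutoloadDirectory": "includes", "AutoloadVendor": "BlueSpice" }';
$map = implicitPsr4Map( json_decode( $json, true ), '/extensions' );
print_r( $map );
```

Under these assumptions an extension that sets neither parameter, such as a hypothetical MyExtension, would get MediaWiki\MyExtension\ mapped to /extensions/MyExtension/includes/, matching the example given earlier in this comment.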

As extension.json allows invoking the Composer autoloader ("load_composer_autoloader": true), you can already make use of Composer's classmap feature.

Change 372521 merged by Legoktm:
[mediawiki/tools/namespaceizer@master] Core alias list

https://gerrit.wikimedia.org/r/372521

kostajh added a subscriber: kostajh.
Krinkle updated the task description. Mar 4 2019, 5:33 PM

Questions for TechCom and/or RFC author, to help move this forward:

  • Who are directly affected by this proposal?

Maintainers of MediaWiki core, new contributors, extension authors.

  • Expected cost or impact for them?

The proposal is expected to reduce maintenance cost for everyone, and expected to lower the barrier for new contributors.

Also, if I understand correctly, it does not impose a short-term cost for migration either, given back-compat. Although long-term, it does mean all code should be updated to use the new class names. If we were to automate this across all Gerrit-hosted repos, we could take away this cost. That would increase implementation cost slightly for the team resourcing it (CPT?). The alternative would be to distribute the migration cost, presumably over a longer period of time given likely not everyone can prioritise it immediately. This in turn would mean everyone has to work with both versions for a while, which increases day to day cost of development due to increased cognitive overhead and confusion. So doing it in automated fashion at once would be cheaper overall.

Lastly, as it appears today, there is no intention to remove support for manual class loading, which means that if a third-party extension has use cases that depend on manual linking of classes to files, this would still be supported. @tstarling Is that right?

  • Stakeholders we should involve at minimum before making a decision?

(TBD). I propose:

  • Maintainers of MediaWiki (gerrit/mediawiki group).
  • Extension authors and third-party developers.
  • TechCom.

While use of PSR-4 is not mandatory for extensions, I'm including third-party developers explicitly because we're likely only going to support PSR-4 in one way. And if there are problems with that way for them, then continuing to list classes manually isn't a fair thing to need to fall back on, unless the use case is for some reason out of scope for MW core to support. So we should make sure our approach satisfies any use cases they have that might differ from WMF, and if not, explicitly state why not.

Krinkle moved this task from Inbox to In progress on the TechCom board. Mar 6 2019, 9:37 PM
kchapman added a subscriber: kchapman.

Last Call ending: March 20 1pm PDT (20:00 UTC, 21:00 CET)

kchapman edited projects, added TechCom-RFC (TechCom-Approved); removed TechCom-RFC.

TechCom has approved this

One concern I have with this is that it may make eval.php/shell.php more annoying to use. I’m already noticing this when working with Wikibase (having to add \Wikibase, \Wikibase\Repo etc. all the time), and I fear it will become much worse when core classes are namespaced too. For a while, the backwards compatibility aliases would solve that, but I assume we don’t want to keep those around forever. Perhaps we can have some kind of auto-use in PsySH (assuming many class names would still be unambiguous), or pre-import the most common names during shell bootstrap?

(This isn’t meant as an argument against the namespaceization, nor does it need to block it as long as we keep the compatibility aliases, but I’m curious what others think about it.)
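
One way the "pre-import during shell bootstrap" idea could look is sketched below. The function names and the Example\ class names are hypothetical, and nothing here is an existing PsySH or MediaWiki feature. The sketch aliases a short name only when exactly one known class claims it.

```php
<?php
// Sketch only: the function names and the Example\ class names are hypothetical.

/**
 * Map short class names to fully qualified names, keeping only short names
 * that are claimed by exactly one class in the list.
 */
function unambiguousShortNames( array $classes ): array {
    $byShortName = [];
    foreach ( $classes as $fqcn ) {
        $pos = strrpos( $fqcn, '\\' );
        if ( $pos === false ) {
            continue; // already in the global namespace, nothing to import
        }
        $byShortName[ substr( $fqcn, $pos + 1 ) ][] = $fqcn;
    }
    $imports = [];
    foreach ( $byShortName as $short => $candidates ) {
        if ( count( $candidates ) === 1 ) {
            $imports[$short] = $candidates[0];
        }
    }
    return $imports;
}

/** Alias each short name to its class, skipping unknown or already taken names. */
function preImport( array $imports ): void {
    foreach ( $imports as $short => $fqcn ) {
        if ( class_exists( $fqcn ) && !class_exists( $short, false ) ) {
            class_alias( $fqcn, $short );
        }
    }
}

$imports = unambiguousShortNames( [
    'Example\\Repo\\Store',
    'Example\\Client\\Store',
    'Example\\Repo\\EntityFinder',
] );
preImport( $imports ); // would run after the autoloader is set up
print_r( $imports );
```

Here Store is claimed by two classes and is skipped, so only EntityFinder would be importable. In this standalone form no classes exist, so preImport() creates no aliases.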

Tgr added a comment. May 29 2019, 2:01 PM

Probably could be handled with a custom tab completer in PsySH, I'd like to look into that at some point (tab completion is frustratingly random currently - it probably doesn't help that PHP is using a crippled clone of readline by default).

MaxSem updated the task description. Aug 9 2019, 3:03 AM

I boldly propose implementing this at least partially at the Wikimania hackathon if anyone wants to help out.

One concern I have with this is that it may make eval.php/shell.php more annoying to use. I’m already noticing this when working with Wikibase (having to add \Wikibase, \Wikibase\Repo etc. all the time), and I fear it will become much worse when core classes are namespaced too. For a while, the backwards compatibility aliases would solve that, but I assume we don’t want to keep those around forever. Perhaps we can have some kind of auto-use in PsySH (assuming many class names would still be unambiguous), or pre-import the most common names during shell bootstrap?
(This isn’t meant as an argument against the namespaceization, nor does it need to block it as long as we keep the compatibility aliases, but I’m curious what others think about it.)

I would personally make a new PHP file in my IDE and copy-paste the code back and forth most of the time. This is safer, especially when you want to run code against production (things can go wrong so easily there).

By comparison, Python also has lots of namespacing, and I have never heard anyone object to it because the Python REPL has no auto-completion.

DannyS712 updated the task description. Thu, Oct 17, 6:06 AM