Page MenuHomePhabricator
Paste P5613

RFC meeting 2017-06-21 T166010 Namespaceization

Authored by tstarling on Jun 22 2017, 12:54 AM.
DanielK_WMDE> #startmeeting
<wm-labs-meetbot> DanielK_WMDE: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
<DanielK_WMDE> #topic The Great Namespaceization and Reorg
<DanielK_WMDE> #link
<DanielK_WMDE> TimStarling: hey.
<DanielK_WMDE> So, what are the main questions you want to discuss today?
<TimStarling> ping anomie, legoktm, SMalyshev, MaxSem
<MaxSem> pong
<TimStarling> one question is deprefixing, SpecialAllpages -> Special\Allpages
<MaxSem> AllPages pretty please
<TimStarling> if you don't do deprefixing, then the patch would be much smaller and simpler to merge
<Scott_WUaS> (hey)
<TimStarling> it would just be a matter of adding namespace and use declarations
<MaxSem> we really need to fix all those little things. otherwise we will have to break things again
<DanielK_WMDE> In the case of SpecialFoo, I'm for removing the prefix. But that could be done later, in a second step
<MaxSem> (or live in an imperfect world!) :P
<DanielK_WMDE> But I'm not in favor of removing all prefixes everywhere. A bit of redundancy is nice, and juggeling three classes with the sdame name from different packages is confusing.
<MaxSem> that's a very valid concern indeed
<TimStarling> and if you have multiple classes of the same name, you can't import them all into a file with their normal names
<MaxSem> examples?
<DanielK_WMDE> TimStarling: yes. so you give them aliases. and they get different aliases in different files.
<DanielK_WMDE> that makes code harder to read
<TimStarling> or you import the namespace
<TimStarling> so in this RFC, I brashly (without much thought) promised to script the whole transition
<tgr> I like the convention that the class name tells you what the class does
<TimStarling> which may or may not be possible
<tgr> as opposed to it being hidden somewhere in the namespace chain
<TimStarling> if you do script it, you can't necessarily pick really good aliases for conflicting class names
<MaxSem> tgr, so MediaWiki\Api\ApiQueryAllUsers?
<DanielK_WMDE> tgr: what it does, or what makes it different from others of its kind.
<tgr> MaxSem: I wouldn't mind that
<DanielK_WMDE> what I don't want to end up with is something like MediaWiki/Model/Wikitext/Hander, MediaWiki/Model/Css/Hander, MediaWiki/Model/JSON/Hander, ...
<legoktm> hello
<DanielK_WMDE> ...but I'm happy with MediaWiki\Api\Query\AllUsers
<MaxSem> DanielK_WMDE, MediaWiki/Model/WikitextHandler
<MaxSem> brrrrrrr, I HATE php's namespace separators so much!
<DanielK_WMDE> MaxSem: yes, better, but assumes all handlers are in the same component. which they really shouldn't be.
<DanielK_WMDE> Yea, backslashes as separators... who came up with that...
<TimStarling> it came from a massive discussion on php-internals
<MaxSem> I disagree that we should have namespaces consisting of 1-2 classes just for the sake of structure
<TimStarling> the most epic bikeshedding imaginable
<TimStarling> I think there was a vote in the end
<DanielK_WMDE> hehe
<TimStarling> doing the C++ thing of using :: for both namespaces and class members was ruled out for technical reasons
<TimStarling> so they had to introduce a new operator
<legoktm> I like keeping some basic class info like MediaWiki\Api\ApiQueryAllUsers. Otherwise I think we're going to alias stuff anyways for code readability, so we might as well do it in the class name
<SMalyshev> oops forgot about this, reading the backlog
<legoktm> MediaWiki\Linker\LinkRenderer vs MediaWiki\Linker\Renderer (latter will always need to be aliased)
<DanielK_WMDE> legoktm: i'm in favor of dropping the Api bit. but it really has to be decided on a namespace-per-namespace basis
<TimStarling> SMalyshev: I know it's a bit late for changing the PHP engine, since we'd have to wait years for it to become our minimum supported version
<DanielK_WMDE> legoktm: yea, former is better
<TimStarling> but would it really hurt to run the autoloader on a type hint failure? you're about to throw a fatal error anyway, so surely it is not a performance problem
<DanielK_WMDE> wouldn't it also run in cases where no fatal is thrown, and the class is found on the second attempt?
<legoktm> Yeah, I agree it needs to be on a case by case basis. For example you typically don't instantiate API modules directly or call them from other code, you really only reference their class name in extension.json.
<SMalyshev> DanielK_WMDE: I hate backslashes too, and the refusal of :: was based on tiny corner case which imho was completely possible to work around, but the community decided otherwise
<TimStarling> but legoktm, you're OK with deprefixing of special pages?
<legoktm> TimStarling: I think I'm coming around to it, yes.
<DanielK_WMDE> TimStarling: about that script of yours - how does it identify class names? Would it change class names in comments and string literals? Should it?
<legoktm> I can't think of a case where you'd want to use both Api\AllPages and Special\AllPages in the same file
<TimStarling> nonexistent script
<SMalyshev> MediaWiki\Api\QueryAllUsers seems ok to me... in general it should leave enough context to make sure it's clear what we're talking about, but can leave out the parts that are there only for disambiguation
<TimStarling> and no it won't change string literals, better to use ::class
<DanielK_WMDE> ...but doxygen blocks...
<TimStarling> yeah doxygen has to be handled somehow doesn't it?
<DanielK_WMDE> my question is really: full parsing, or just smart-ish search and replace?
<legoktm> doxygen's handling of namespaces is pretty poor
<SMalyshev> most string literals should use ::class, but extension.json is one example where it is not possible
<DanielK_WMDE> but parsing and modifying json is simple
<DanielK_WMDE> parsing php is not
<TimStarling> well, PHP has token_get_all() which I've used before in a couple of projects
<legoktm>;468ae96d5dfee1dd2a309e80e3608cd509179ed9$58 is what we had to do to get doxygen to work properly
<SMalyshev> parsing json is simple, but one needs to be very careful to keep all json using correct names when it changes...
<TimStarling> there was a thing called ConfEditor which did automatic edits of structured data encoded in PHP
<DanielK_WMDE> ...from a web request?
<DanielK_WMDE> i bet.
<TimStarling> and stylize.php, which is now replaced by phpcs
<SMalyshev> you can parse php with this: but it's 7+
<TimStarling> phpcs itself uses token_get_all() and transforms the token stream without building an AST
<TimStarling> and legoktm suggested looking at Phan, which does build an AST
<legoktm> phan uses the php-ast extension which SMalyshev linked
<SMalyshev> there's also
<SMalyshev> phan uses php-ast iirc
<TimStarling> right
<DanielK_WMDE> tokenization isn't entirely sufficient - you'd need to handle the use clauses to know which class is meant, and whether it shhould be changed or not
<DanielK_WMDE> also doable. more work. maybe not worth it
<TimStarling> if something is namespaced already then it doesn't need to be changed, right?
<TimStarling> the script will scan for class names that are on a list of legacy aliases
<TimStarling> and then add use statements for any that are found
<DanielK_WMDE> #info support for de-prefixing SpecialFoo and ApiFoo. Needs to be decided per namespace.
<DanielK_WMDE> #info Tim is contemplating an automatic conversion script, based on token_get_all() or some other PHP parsersing utility
<TimStarling> <SMalyshev> MediaWiki\Api\QueryAllUsers seems ok to me...
<SMalyshev> I'd probably go for a script that fixes easiest 90% and tells about the remaining 10% and fix those manually
<TimStarling> that particular example is misleadingly clear
<TimStarling> we also have ApiMain, so then you have Api\Main?
<SMalyshev> Api\Main is probably not good - what is "Main"?
<legoktm> Another minor point of PSR-4 is the one class per file requirement. I think that's pretty uncontroversial right? We can also have PHPCS enforce this ahead of the great namespaceization
<TimStarling> ApiLogin, ApiBase
<SMalyshev> QueryAllUsers is pretty clear, it queries all users, but "main"?
<MaxSem> rename it to Api\Dispatcher or something?
<TimStarling> ApiHelp, ApiResult
<DanielK_WMDE> maybe Api\ApiMain, but Api\Modules\AllUsers?
<DanielK_WMDE> there will be many edge cases to decide
<tgr> if written from scratch I would go for something like Api\MainModule, Api\Query\AllUsersModule
<tgr> not sure we should go wild on all the weirdly named MediaWiki classes at the same time, though
<MaxSem> not in the same commit, that's for sure
<DanielK_WMDE> TimStarling: what about the directory structure? My thinking is that includes/ would correspond to MediaWiki\. Anything that belongs into Wikimedia\ should not be under includes/.
<MaxSem> Another question: this is a wonderful opportunity to go 2.0, right? This will break some compatibility no matter how hard we try, so there is a logical reason to bump the major number per semver
<DanielK_WMDE> by the same token - perhaps languages/ should move into includes/?
<TimStarling> nobody really liked my idea of using MediaWiki/ for MediaWiki\ ?
<DanielK_WMDE> MaxSem: I'd rather drop the 1., and go for 31.
<tgr> can we use src/? it's sort of standard
<SMalyshev> having "mediawiki" directory inside mediawiki sounds redundantly redundant :)
<bd808> +1 for src/ instead of includes/
<DanielK_WMDE> +1
<DanielK_WMDE> well... hm.
<MaxSem> +1 to keep includes
<bd808> I'd love to see the webroot moved into a subdir too
<DanielK_WMDE> src/MediaWiki and src/Wikimedia and src/Languages?
<TimStarling> there's also the question of what to do about non-class files like WebStart.php
<TimStarling> maybe WebStart.php in particular should stay in includes/ for now for b/c
<DanielK_WMDE> #info preference of src/ over MediaWiki/ as the top level dir. Maybe src/MediaWiki/ would be an option
<tgr> I would map MediaWiki\ to src/ and maybe Wikimedia\ to src/lib/ (hopefully that's just temporary until they get vendorized)
<TimStarling> pity we don't have Krinkle, he was just talking about bd808's idea
<tgr> so Mediawiki\Foo\Bar is src/Foo/Bar.php
<bd808> having all the main business logic classes inside the webroot just creeps me out :)
<TimStarling> I think it would be good for security
<DanielK_WMDE> tgr: i prefer src/Wikimedia over src/libs
<MaxSem> that would break lots of end users though
<TimStarling> it's probably a separate project
<MaxSem> yeah
<TimStarling> you know dbarratt is thinking along the same lines, that's why it came up in the committee meeting last hour
<DanielK_WMDE> #idea <bd808> I'd love to see the webroot moved into a subdir too
<DanielK_WMDE> #info <TimStarling> [re changing webroot] it's probably a separate project
<TimStarling> is it OK to have supporting files like tidy.conf be near to their class files in the directory structure?
<DanielK_WMDE> i like that, yea
<Scott_WUaS> (DanielK_WMDE: would adding subsequent languages to Wikimedia's 358 - ie re "perhaps languages/ should move into includes/?" - impact this at all? That is, would adding all 7,099 -358 languages eventually impact this?)
<bd808> Scott_WUaS: (no)
<Scott_WUaS> (Thanks)
<DanielK_WMDE> Scott_WUaS: this is purely about where which php file shuld be. it does not influence functionality at all.
<TimStarling> so for languages, it makes sense to move the class files into src in the natural location
<Scott_WUaS> Great
<TimStarling> but what about the rest of languages?
<DanielK_WMDE> Languages/i18n/ could move to the top level. Languages/data etc could perhaps move into i18n/?
<bd808> I've used data/i18n in other php projects and also had things like data/db and data/templates
<TimStarling> I think it makes sense to hide enormous directories like i18n down a couple of levels
<DanielK_WMDE> TimStarling: do you think directory moves and namespace and name changes should be comitted together, or should they be kept separate?
<bd808> Not sure there are lots of strong conventions for this
<TimStarling> it can probably be separate
<tgr> extensions already have a top-level i18n so might as well go with that
<TimStarling> true
<DanielK_WMDE> #info keep changes to dir structure in separate changes from namespace changes, for reviewer's sanity
<TimStarling> we do need a data directory
<MaxSem> Small Q: what will happen to includes/installer/i18n?
<bd808> putting i18n at the top level is a nice signal that it is very important for the product
<DanielK_WMDE> a general one, or one specifically for i18n data?
<TimStarling> we have serialized, which is now almost a generic data directory since it contains a cdb file
<SMalyshev> yeah it was kinda weird for me that i18n in mediawiki is not in the same place as in exts
<DanielK_WMDE> #info <bd808> putting i18n at the top level is a nice signal that it is very important for the product
<TimStarling> any comments on MaxSem's idea of incrementing the major version number?
<bd808> it would be a strong signal of breaking back compat certainly
<TimStarling> but we are not really breaking it
<DanielK_WMDE> i think it would make more sense to make all mw releases major versions. we already do breaking changes there
<DanielK_WMDE> just drop the 1. prefix
<MaxSem> TimStarling, or so we thought
<legoktm> I agree with DanielK_WMDE
<TimStarling> MaxSem's reasoning is "I'm sure you'll break something"
<TimStarling> maybe he is right but it's not really a goal
<bd808> MediaWiki 2k17.1 :)
<MaxSem> yup!
<DanielK_WMDE> ...and when we change the web root, we go to 3.0?...
<MaxSem> R31 Service pack 3
<bd808> Is there a reason to try hard for back compat? We just shipped an LTS right?
<DanielK_WMDE> we could also asymptotically approach sqrt(2)...
<TimStarling> we could do a firefox and go from 1.30 to 2.0, then increment the major version in each release after that
<MaxSem> I'm not opposed to incrementing major, but don't want to jump all the way to 31.x
<MaxSem> *to always incrementing major
<legoktm> bd808: because if we break back-compat, we also have to fix all of our extensions which is a lot of work
<DanielK_WMDE> #info thoughts about bumping the major version number to indicate a big break
<legoktm> I think we should handle the version number issue in a separate discussion, it's pretty tangential
<DanielK_WMDE> MaxSem: why not?
<DanielK_WMDE> yea.
<DanielK_WMDE> 12 minute warning
<DanielK_WMDE> TimStarling: any technicalities about class_alias?
<MaxSem> DanielK_WMDE, look how Linus did that
<TimStarling> I'd like it if PHP and HHVM could do one or both of the following:
<TimStarling> 1. autoload on type hint failure
<TimStarling> 2. allow class_alias() to work with a non-existent destination and autoload the destination on first request
<SMalyshev> 1. not going to happen probably...
<TimStarling> but like I say, it is a bit late for this projec
<SMalyshev> 2. if somebody finds *how* to do it...
<DanielK_WMDE> what's the deal with type hint failure?
<SMalyshev> DanielK_WMDE: it's mainly in the situation where class X is loaded but its alias is not
<TimStarling> as I said on the task: "Usually when you pass an object to a type-hinted function, the class is already defined, as are its parent classes, on account of the fact that you have created an instance of it. And similarly, the assumption is that if the class is undefined when the type-hinted function is entered, there must be no existing instances and the type hint must have failed. These assumptions fail when class aliases are used."
<SMalyshev> I think the best way to solve it is to always load the class and all aliases together
<TimStarling> SMalyshev: not going to happen because class_alias is not favoured as a good feature?
<TimStarling> without class_alias(), there is no way we would be considering this namespaceization project
<TimStarling> so I for one am glad it exists
<SMalyshev> TimStarling: well, class_alias is kind of hack but mainly because that would probably require also change for instanceof, etc. and that may be a performance issue
<DanielK_WMDE> ah, i see.
<SMalyshev> and really the only use case for this is aliases, which are not that important to mess with the whole universe of class loading just to fix the issue that is fixed by loading aliases together with a class
<SMalyshev> IMHO of course but given it didn't happen so far...
<DanielK_WMDE> #info <SMalyshev> I think the best way [...] is to always load the class and all aliases together
<SMalyshev> I'm not sure does composer and out autoload gen scripts deal with aliases properly?
<SMalyshev> i.e. do they generate entries for both?
<legoktm> our current autoload generator handles class_alias, it generates entries for both
<TimStarling> right, we haven't talked about the actual autoloader implementation, note this is for core so we're not tied to a particular autoloader
<DanielK_WMDE> legoktm: but does composer?
<DanielK_WMDE> #info <legoktm> our current autoload generator handles class_alias, it generates entries for both
<TimStarling> my idea is to first run a PSR-4 compliant step, not necessarily the actual PSR-4 library
<legoktm> I haven't tested composers, it wouldn't be hard to do, or patch it to do so I think
<DanielK_WMDE> legoktm:
<TimStarling> and then if it fails, check a legacy array generated similarly to how it is now
<DanielK_WMDE> seems to be just that
<TimStarling> so classes that are in their proper place according to PSR-4 will be removed from the legacy array
<bd808> composer had rejected bugs and patches for class_alias() stuff already -- --
<Scott_WUaS> (Cheers, All)
<DanielK_WMDE> #info [custom class loader] Tim: first run a PSR-4 compliant step, [...] if it fails, check a legacy array generated similarly to how it is now
<TimStarling> it's not a big deal unless we follow dbarratt's idea and have core loaded via composer
<TimStarling> but that seems unlikely
<DanielK_WMDE> oh - if i understand correctly, that composer plugin allows for a centralized alias map, with no need for manually adding class_alias in hundreds of files
<DanielK_WMDE> may be worth considiering...
<DanielK_WMDE> ok, we are out of time
<DanielK_WMDE> any last comments? things to log?
<TimStarling> thanks everyone, that was helpful
<DanielK_WMDE> cheers!
<DanielK_WMDE> #endmeeting
<quiddity> (note, meetbot didn't trigger successfully, so you'll need to record the notes manually.)