
Librarize multiversion code
Open, Needs Triage, Public

Description

Multiversion should be its own library; it's a bunch of complex PHP logic and it makes no sense whatsoever to store it in a configurations repo. Worse, MediaWiki-Vagrant needs it to provide a production-like environment, and right now that can only be done by copying the files and having MWV include its own fork of them, with things getting out of sync, MWV collecting weird Vagrant-specific hacks, etc. It may also be useful for third-party wiki farms if it can be made sufficiently generic.

Event Timeline

Tgr created this task. Sep 29 2017, 8:40 PM
Restricted Application added a subscriber: Aklapper. Sep 29 2017, 8:40 PM
Mainframe98 added a subscriber: Mainframe98.
Tgr added a subscriber: Anomie. Sep 30 2017, 2:05 AM

Used to be operations/mediawiki-multiversion at some point but got merged into the config repo in 2012. @Anomie do you remember the reasons for that?

Anomie added a comment. Oct 2 2017, 4:03 PM

Changes in I7ef35304 were causing dependency issues between mediawiki-config and mediawiki-multiversion in ways that made merging them the more sensible option. See the code review comments there and on I58901bfd. I note that a lot of the executables that were in multiversion have since been removed.

But the files remaining in the repo are still rather heavily dependent on WMF's infrastructure, so a true librarization may still not make much sense.

it's a bunch of complex PHP logic and it makes no sense whatsoever to store it in a configurations repo.

You could say the same thing about a bunch of other stuff in operations/mediawiki-config. The repository isn't just "configuration" in the sense you're thinking, it's also custom logic to generate parts of the configuration at runtime based on aspects of WMF's deployment environment and aspects of the request (e.g. the Host header).

bd808 added a subscriber: bd808. Oct 2 2017, 4:19 PM

Some (many?) of the differences between the production and MediaWiki-Vagrant multiversion scripts are intentional. Multiversion is neat, but it has never seemed to me to be a generic wiki farm management tool. I borrowed heavily from it for the wiki farm management system in MediaWiki-Vagrant, but mostly because it was easy. I didn't mean it to be an endorsement of Multiversion as the 'one true way' to run a wiki farm.

If convergence is desired, it might be better to put energy into https://www.mediawiki.org/wiki/Extension:MediaWikiFarm or one of the other existing attempts at a general purpose farm management system.

Seb35 added a subscriber: Seb35. Oct 7 2017, 8:51 AM

I created rEMWF extension-MediaWikiFarm mostly by copying the concepts from rOMWC Wikimedia - MediaWiki Config/multiversion and wgConf/SiteConfiguration: hierarchical configuration, per-wiki configuration cache, a bootstrap to choose the MediaWiki version located in different directories, and the syntax to add values to array parameters, amongst others. It is intended to be a proper farm management system; it was developed primarily for the MediaWiki hosting of our company, but I try to make it as general as possible, without site-specific assumptions.
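As a rough illustration of the "hierarchical configuration" and "add values to array parameters" concepts mentioned above, here is a minimal Python sketch. It is not the code of MediaWikiFarm or SiteConfiguration; the function name `resolve_setting`, the `"+"` key prefix as modelled here, and all sample data are assumptions made for the example.

```python
# Simplified sketch of hierarchical per-wiki configuration.
# All names and sample data are hypothetical, for illustration only.

def resolve_setting(levels, wiki, tags=()):
    """Resolve one setting for `wiki`.

    `levels` maps a key ("default", a tag such as "wikipedia", or a
    wiki ID) to a value. A key prefixed with "+" means: append this
    list to the value found at a less specific level instead of
    replacing it.
    """
    additions = []
    value = None
    # Walk from most specific (wiki ID) to least specific ("default").
    for key in (wiki, *tags, "default"):
        if key in levels:
            value = levels[key]
            break
        if "+" + key in levels:
            additions.append(levels["+" + key])
    if not additions:
        return value
    merged = list(value or [])
    # Apply the less specific additions first.
    for extra in reversed(additions):
        merged.extend(extra)
    return merged


settings = {
    "wgSitename": {"default": "Example Wiki", "wikipedia": "Wikipedia"},
    "wgExampleList": {"default": ["Portal"], "+enwiki": ["Draft"]},
}

sitename = resolve_setting(settings["wgSitename"], "enwiki", ["wikipedia"])
example_list = resolve_setting(settings["wgExampleList"], "enwiki", ["wikipedia"])
```

In this sketch a plain key replaces anything less specific, while a `"+"` key keeps looking further down the hierarchy and appends to what it finds there.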

There are other ways to do configuration management (Ansible/Puppet/whatever to create per-wiki LocalSettings, T149617), but these solutions rely on external tools (an additional layer of complexity, particularly for small farms), and there will always be a need for:

  • the bootstrap part (only useful for multi-version farms, but this might be a desirable property, at least for smooth upgrades, but also if a farm wants to offer both the LTS and the latest MediaWiki, for instance),
  • (for Ansible/Puppet) the switch to route to the specific configuration depending on the host, both on the web and on the CLI,
  • management of Composer-loaded extensions to enable per-wiki activation (there are some, mainly SemanticMediaWiki and extensions).
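
The first two points can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions, not how multiversion or MediaWikiFarm actually does it: the host names, the `--wiki=` argument handling, the `php-<version>` directory layout, and every identifier are hypothetical.

```python
# Hypothetical data: which MediaWiki version each wiki runs, and which
# host maps to which wiki. None of this mirrors real production files.
WIKI_VERSIONS = {"enwiki": "1.31.0", "testwiki": "1.32.0"}
HOST_TO_WIKI = {"en.example.org": "enwiki", "test.example.org": "testwiki"}
INSTALL_ROOT = "/srv/mediawiki"


def wiki_from_request(environ, argv):
    """Pick the wiki ID from the Host header (web) or --wiki=... (CLI)."""
    host = environ.get("HTTP_HOST")
    if host:
        # Drop an optional port and normalise case before the lookup.
        return HOST_TO_WIKI.get(host.split(":")[0].lower())
    for arg in argv:
        if arg.startswith("--wiki="):
            return arg[len("--wiki="):]
    return None


def version_dir(environ, argv):
    """Return the directory of the MediaWiki version serving this wiki."""
    wiki = wiki_from_request(environ, argv)
    if wiki not in WIKI_VERSIONS:
        raise LookupError(f"unknown wiki: {wiki!r}")
    return f"{INSTALL_ROOT}/php-{WIKI_VERSIONS[wiki]}"


web_dir = version_dir({"HTTP_HOST": "EN.example.org:8080"}, [])
cli_dir = version_dir({}, ["maintenance.py", "--wiki=testwiki"])
```

The same lookup serves both entry paths, which is the property the second point asks for: a web request and a maintenance script for the same wiki end up in the same version directory.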

On the other hand, and this is my humble opinion, any librarization for MediaWiki farm management should not do too many things and should leave specific, environment-dependent tasks to other tools (or manual procedures), for instance the creation and deletion of wikis, because these involve specific interactions with the database, connections to Memcached/Parsoid/Citoid, configuration of Varnish/nginx/Apache/DNS, storage of files, etc.

Seb35 awarded a token. Oct 7 2017, 8:51 AM
demon added a subscriber: demon. Dec 1 2017, 6:30 PM

I don't think librarization of multiversion is the right way to go. If we want some sort of generic farming abilities, they should be in core or some library or an extension, not come from repurposing multiversion (which, btw, I've slowly been trying to trim the edges of to make it less complex).

I think this task should be declined.

I think I would also be of the opinion that multiversion should not be made into a standalone library.

For one thing, multiversion is very tightly integrated with the current layout of WMF code structure and may not generalize well. To that end, multiversion was one of the myriad obstacles that helped put the kibosh on plans to simplify the branching model at the WMF (T89945). That is, while multiversion is a way to run multiple wikiversions, it is certainly not the only way, nor is it a very flexible way.

Also, librarization can be viewed as an implicit endorsement that this is the appropriate way to run a wiki farm, whereas (in my view) multiversion has come into being as an ad hoc solution that has been built with little consideration for generalization.

Code like multiversion is definitely what makes the world go 'round: code that has grown up over the years to solve a specific problem fairly well, but many of the very specific problems that multiversion solves are fairly esoteric and very WMF-server specific.

Tgr added a comment. Dec 1 2017, 6:55 PM

The point is that production and vagrant share a fair amount of code via manual duplication which is not a great situation to be in. If it's generic enough to be useful for third parties that's an extra benefit; if not (which is apparently the case), code duplication between the config repo and vagrant is still not great. (And the config repo hosting complex PHP code is probably not that great either.)

demon added a comment. Dec 1 2017, 7:05 PM

The point is that production and vagrant share a fair amount of code via manual duplication which is not a great situation to be in. If it's generic enough to be useful for third parties that's an extra benefit; if not (which is apparently the case), code duplication between the config repo and vagrant is still not great.

Agreed, copy+pasted code isn't great. But I don't think multiversion is the right solution to this problem for Vagrant or anyone else for that matter. Plus the codebases have diverged a little (my upstream changes haven't made it back into Vagrant). If Vagrant *really* wants to use an unmolested multiversion, it's more than welcome to add the repo as a submodule ;-)

(And the config repo hosting complex PHP code is probably not that great either.)

The repo is poorly named. It hosts a lot more than just configuration, and multiversion is far from the only complex code in it (I'd argue multiversion isn't all that complex, but I digress).

Tgr added a comment. Edited Dec 1 2017, 7:21 PM

Agreed, copy+pasted code isn't great. But I don't think multiversion is the right solution to this problem for Vagrant or anyone else for that matter. Plus the codebases have diverged a little (my upstream changes haven't made it back into Vagrant). If Vagrant *really* wants to use an unmolested multiversion, it's more than welcome to add the repo as a submodule ;-)

Vagrant is supposed to be close to production, so multiversion seems appropriate (and I'm not sure there are any alternatives ATM). It needs to fork it since a bunch of things are different (e.g. dynamic wiki management). If multiversion were its own repo, that would be easy: it gets few updates, so it wouldn't be much effort to sync up. With multiversion living in the super-busy config repo, that's not really realistic. Git can do sparse checkouts, but I don't think there is such a thing as a sparse fork.

demon removed a subscriber: demon. Mar 16 2019, 3:33 PM