Page MenuHomePhabricator

mediawiki/vendor is bloated for tarball releases
Closed, ResolvedPublic

Description

The mediawiki/vendor repo has a bunch of stuff that is now really only for Wikimedia and we should re-evaluate whether we should be including all of those extraneous libraries in the tarball.

Event Timeline

It seems we already have REL branches for mediawiki/vendor, but they're currently created by branching off master (which is mostly tailored for WMF).

I'd suggest to instead create REL branches fresh with a composer.json that is based on merging the one from core + the bundled extensions.

I'd suggest to instead create REL branches fresh with a composer.json that is based on merging the one from core + the bundled extensions.

Sounds good.

We could have multiple repos or multiple branches. mediawiki vs wikimedia. But this would be more stuff to maintain..

Do we really need to use mediawiki/vendor for tarball? We can just install prod dependencies from composer.json.

Do we really need to use mediawiki/vendor for tarball? We can just install prod dependencies from composer.json.

This is true, we could. Only drawback was T137564 wherein my goal was to have a tag represent a totally usable version of MediaWiki that basically resembles the tarball extraction. But we could drop vendor from that request and only do skins/extensions.

If all this is true: why do we call it mediawiki/vendor if you don't need it outside of wikimedia?

If all this is true: why do we call it mediawiki/vendor if you don't need it outside of wikimedia?

Well, you do need part of it, just probably not all of it. The requirements back during the 1.25 release were:

  • People who download the tarball have all the composer dependencies needed to run core + bundled extensions already included
  • People who use git have some alternative to get dependencies without running composer
  • The tarball is built in an isolated environment in production, so we can't run composer during tarball build.
    • Additionally, Chris didn't like the idea of just grabbing PHP from the internet without any human review and including it in the tarball. (This was when composer was still vulnerable to a MITM attack)

And at the time I don't think we anticipated the diff between "tarball required dependencies" and "Wikimedia required dependencies" to grow so large. There's also not a real problem besides the tarball including things that most users won't need. So we could close this task as Declined and I think that would still be OK.

Overall I like Krinkle's idea of keeping master at Wikimedia dependencies but updating REL branches to strip out everything extraneous and not needed for the tarball.

  • The tarball is built in an isolated environment in production, so we can't run composer during tarball build.

The tarball is not built in an isolated environment in production, so we can run composer during tarball build.

I was about to ask if there is a documentation/policy about what is included in mediawiki/vendor compared to mediawiki/core, and then I found this task. If my understanding below is right, I will add add a note on this on mw:Composer.

In my understanding:

  • mediawiki/vendor was historically created for deployment issues for Wikimedia and hence is tightly related to Wikimedia’s production
  • mediawiki/vendor became the default vendor in tarballs during 1.29 release
  • the main issue in order to use the mediawiki/core’s composer.json is that the various librairies from {extensions,skins}/*/composer.json should be included in the main MediaWiki vendor
    • (an alternative would be to execute Composer in each extension/skin to create extensions/*/vendor directories if said extensions support this feature, but I understand it is less stable/robust.)

I just tried the tip from mw:Composer#Using composer-merge-plugin by adding:

{
    "extra": {
        "merge-plugin": {
            "include": [
                "extensions/*/composer.json",
                "skins/*/composer.json"
            ]
        }
    }
}

in main composer.json, and then executing composer update --no-dev. I have 854 extensions and 46 skins in my test installation (mostly from Gerrit, but a few from GitHub), all are in master branch and up-to-date (I pulled all extensions/skins/MediaWiki); obviously I had dependencies issues, but only some are quite critical (9 extensions, see the 5 first rounds in the paste P6481), others seems to be related to minimum stability mainly from Wikidata and SemanticMediaWiki. There are no issues with the small subset of the extensions/skins included by default in the tarball.

With this experiment, adding all the composer.json from extensions/skins works well on a small subset, but it could lead to issues when people add less-maintened extensions/skins and more importantly it creates a dependency between all extensions. Hence:

  • either a policy or best practice becomes necessary about librairies versions and/or,
  • the REL1_XX branches in the extensions/skins should be created only for compatible extensions/skins and non-compatible extensions/skins are deemed "non-compatible" (and it will create exceptions in deployment processes if REL1_XX could be nonexistant).

On the other side it will improve global quality and robustness of the extensions/skins corpus (currently if a same library is used in two extensions and the first loaded library is the old one, old classes could miss some methods called by the new version, creating fatal errors). In the short-term this solution could be used only when creating the tarball, the global dependency hell management could become a goal.

An important detail: when minimum-stability is not set (=stable) there are issues mainly from Wikidata and SemanticMediaWiki, which are two interlinked corpora of extensions. Setting minimum-stability = dev improves the situation. I should be decided if this is included in MW master version [but this would weaken MediaWiki’s continous deliverability quality, or CI should be adapted by removing this minimum-stability in tests] xor if extensions/skins are forbidden from using this minimum stability xor if intermediary solutions could be found.

So I think what we should do is drop vendor from the REL1_xx branches. Then only extensions and skins will be submodules. Vendor will continue to be a "build it yourself using composer" if you're running from git. This should solve basically all the problems here.

Change 399849 had a related patch set uploaded (by Chad; owner: Chad):
[mediawiki/core@REL1_29] Drop vendor from MW release branch

https://gerrit.wikimedia.org/r/399849

Change 399102 had a related patch set uploaded (by Chad; owner: Chad):
[mediawiki/core@REL1_27] Add REL1_27 submodules for bundled extensions/skins/vendor

https://gerrit.wikimedia.org/r/399102

Change 399852 had a related patch set uploaded (by Chad; owner: Chad):
[mediawiki/core@REL1_30] Drop vendor from MW release branch

https://gerrit.wikimedia.org/r/399852

So I think what we should do is drop vendor from the REL1_xx branches. Then only extensions and skins will be submodules. Vendor will continue to be a "build it yourself using composer" if you're running from git. This should solve basically all the problems here.

That makes sense for git users, however I specifically filed this task about the tarball... :)

So I think what we should do is drop vendor from the REL1_xx branches. Then only extensions and skins will be submodules. Vendor will continue to be a "build it yourself using composer" if you're running from git. This should solve basically all the problems here.

Well I agree this makes sense, perhaps this should be announced on mediawiki-l in case anyone is depending on the old behaviour.

So I think what we should do is drop vendor from the REL1_xx branches. Then only extensions and skins will be submodules. Vendor will continue to be a "build it yourself using composer" if you're running from git. This should solve basically all the problems here.

That makes sense for git users, however I specifically filed this task about the tarball... :)

....at which point, we'd go back to bundling from composer directly at build time.

This also raises an interesting question: should the bundled extensions be added as submodules on master as well? I think you could make a good argument for it:

  • Branching core just got a whole lot easier
  • Presumably, the add/remove submodule logic could be shared with make-wmf-branch
  • It gives greater visibility to the bundled extensions and helps make sure they're not broken at release time, heh
  • No need to maintain the list in make-release anymore

So I think what we should do is drop vendor from the REL1_xx branches. Then only extensions and skins will be submodules. Vendor will continue to be a "build it yourself using composer" if you're running from git. This should solve basically all the problems here.

That makes sense for git users, however I specifically filed this task about the tarball... :)

....at which point, we'd go back to bundling from composer directly at build time.

I think I would feel more comfortable with the suggestion from earlier in this ticket that after branching REL1_XX, we update vendor/ for only the dependencies needed to run core and tarball bundled extensions. This will let us verify that we're bundling exactly what we want (instead of trusting composer/packagist/github/etc.).

The second benefit from maintaining vendor in git is that if necessary, we can apply patches to libraries via the normal means (which we've done in the past) instead of figuring out how to get it into the build process.

Change 399102 merged by jenkins-bot:
[mediawiki/core@REL1_27] Add REL1_27 submodules for bundled extensions/skins/vendor

https://gerrit.wikimedia.org/r/399102

Change 401574 had a related patch set uploaded (by Chad; owner: Chad):
[mediawiki/tools/release@master] Stop branching vendor

https://gerrit.wikimedia.org/r/401574

Change 401574 merged by jenkins-bot:
[mediawiki/tools/release@master] Stop branching vendor

https://gerrit.wikimedia.org/r/401574

Change 399852 merged by jenkins-bot:
[mediawiki/core@REL1_30] Drop vendor from MW release branch

https://gerrit.wikimedia.org/r/399852

Change 399849 merged by jenkins-bot:
[mediawiki/core@REL1_29] Drop vendor from MW release branch

https://gerrit.wikimedia.org/r/399849

Change 401659 had a related patch set uploaded (by Chad; owner: Chad):
[mediawiki/core@master] WIP: Add bundled extensions as submodules of core

https://gerrit.wikimedia.org/r/401659

demon claimed this task.

Per T156637 : This will be fixed for all future releases by removing mw/vendor from branches and fixing makerelease.py.

Change 401659 abandoned by Chad:
WIP: Add bundled extensions as submodules of core

https://gerrit.wikimedia.org/r/401659