Page MenuHomePhabricator

`mediawiki-core-php72-phan-docker` job runs `composer install` instead of using packages from mediawiki/vendor
Closed, DeclinedPublic

Description

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/708170 (And all other core patches but not vector) is failing.

This should have been fixed by https://gerrit.wikimedia.org/r/c/mediawiki/vendor/+/708058 but wasn't

Build saved: https://integration.wikimedia.org/ci/job/mediawiki-core-php72-phan-docker/51585/

There was a bad package version of parsoid merged into mediawiki-vendor, but the CI breakage persisted even after mediawiki-vendor was reverted, because the CI job was doing a "composer install".

This is Not Good. It means that parsoid (and presumably other libraries) are "released" to CI *as soon as they are tagged*, not when the patch is merged into mediawiki-vendor. That bypasses all the gate CI on mediawiki-vendor, among other issues.

Most libraries in core's composer.json are pinned to specific versions, *but not all*. All unpinned packages are going to leak new versions into CI in the same manner as parsoid. Doing an uncontrolled composer install in our CI seems like a bad idea.

Event Timeline

RhinosF1 triaged this task as Unbreak Now! priority.Jul 26 2021, 9:34 PM
RhinosF1 created this task.

includes/resourceloader/VueComponentParser.php:101 PhanRedefinedClassReference Saw reference to \RemexHtml\DOM\DOMBuilder declared at vendor/wikimedia/dodo/.phan/stubs/DOMBuilder.php:24 which is also declared at vendor/wikimedia/remex-html/RemexHtml/DOM/DOMBuilder.php:21. This may lead to confusing errors. It may be possible to exclude the class that isn't used with exclude_file_list. In addition to normal ways to suppress issues, this issue type can be suppressed on either of the class definitions if it is impractical to exclude one file.
22:32:34 includes/resourceloader/VueComponentParser.php:105 PhanRedefinedClassReference Saw reference to \RemexHtml\DOM\DOMBuilder declared at vendor/wikimedia/dodo/.phan/stubs/DOMBuilder.php:24 which is also declared at vendor/wikimedia/remex-html/RemexHtml/DOM/DOMBuilder.php:21. This may lead to confusing errors. It may be possible to exclude the class that isn't used with exclude_file_list. In addition to normal ways to suppress issues, this issue type can be suppressed on either of the class definitions if it is impractical to exclude one file.
22:32:34 includes/resourceloader/VueComponentParser.php:105 PhanTypeMismatchReturn Returning $domBuilder->getFragment() of type \Wikimedia\Dodo\Node but parseHTML() is declared to return \DOMDocument

I am confused by these failures. That reverted patch on vendor should have completely removed all references to Dodo from vendor, so i don't know why those CI jobs are even running into Dodo at all!

Parsoid not being alpha 9 from composer is answer

Okay, tagged a new version of Parsoid as -a10 and that seems to have fixed the problem.

I don't think core CI's should ever be running composer install blindly, it should be using the versions pinned in mediawiki-vendor.

We shouldn't close this bug until we nail down what was going on here. Paging @Jdforrester-WMF ?

ssastry lowered the priority of this task from Unbreak Now! to Medium.Jul 26 2021, 10:55 PM
cscott renamed this task from Core CI failing with Vendor phan issues to `mediawiki-core-php72-phan-docker` job runs `composer install` instead of using packages from mediawiki/vendor.Jul 26 2021, 11:03 PM
cscott updated the task description. (Show Details)

Change 588479 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/tools/phan@master] Exclude stubs from .phan configurations in libraries

https://gerrit.wikimedia.org/r/588479

Change 708178 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/libs/Dodo@master] Don't export .phan/ or tools/ directories in composer package

https://gerrit.wikimedia.org/r/708178

Change 708179 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/libs/IDLeDOM@main] Don't export .phan/ in composer package

https://gerrit.wikimedia.org/r/708179

Change 708179 merged by jenkins-bot:

[mediawiki/libs/IDLeDOM@main] Don't export .phan/ in composer package

https://gerrit.wikimedia.org/r/708179

Change 708178 merged by jenkins-bot:

[mediawiki/libs/Dodo@master] Don't export .phan/ or tools/ directories in composer package

https://gerrit.wikimedia.org/r/708178

Change 708194 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Upgrade wikimedia/dodo to 0.2.0

https://gerrit.wikimedia.org/r/708194

Change 708194 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Upgrade wikimedia/dodo to 0.2.0

https://gerrit.wikimedia.org/r/708194

Note that the mwext-php72-phan-docker job doesn't do this anomalous composer install, as shown in https://integration.wikimedia.org/ci/job/mwext-php72-phan-docker/125848/ .

Change 708791 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.14.0-a11

https://gerrit.wikimedia.org/r/708791

Change 708791 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.14.0-a11

https://gerrit.wikimedia.org/r/708791

Change 588479 merged by jenkins-bot:

[mediawiki/tools/phan@master] Exclude stubs from .phan configurations in libraries

https://gerrit.wikimedia.org/r/588479

Dumping my own understanding of the root cause here (corrections welcome) -- as I understand it, mediawiki-vendor is *not* supposed to include dev dependencies (require-dev), it is supposed to include *only* the libraries needed for production. However, this separation gets slippery, as there are a bunch of non-production libraries included in mediawiki-vendor because deployers find it makes their life easier to have some basic testing support available on the production machines -- and I think there are some other weird exceptions as well.

As a result, even though it "looks like" mediawiki-vendor has dev dependencies, in actual practice CI must always run 'composer install' on certain targets in order to ensure that the dev dependencies are actually installed. This is Bad, but the current plan is (as I understand it) to fix this by properly encapsulating CI runs so that composer install is "safe" rather than (eg) either including all dev dependencies in mediawiki-vendor or splitting the package so that (say) mediawiki-vendor-dev is installed for CI instead of live-running composer install.

I'm just going to mention composer update here so it show up in my search (since I always forget) and mention https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/612431 as a patch which is blocked on the resolution of this bug.

T344032: CI failing on wmf/1.41.0-wmf.20 due to Parsoid\Config\SiteConfigTest is another instance of this issue, with a little wrinkle. In addition to composer install taking the wrong versions in the window between our tagging a new Parsoid release and merging the corresponding patch in mediawiki-vendor, in T344032 as I understand it the wmf.21 train was delayed, and so patches were backported to wmf.20 even after Parsoid had tagged and released for wmf.21. As a result, the composer install step in the wmf.20 CI ended up using the wmf.21 version of Parsoid instead of the proper version from mediawiki-vendor.

I don't think core CI's should ever be running composer install blindly, it should be using the versions pinned in mediawiki-vendor.

This is an unconventional way to use Composer. It's not an accident that we have tests that run composer install. We've been running all tests twice, once for the vendor repo and once with composer, since March 2015 in a set of commits linked to T90303. I always thought the point of it was to ensure that there is no divergence between the vendor repo and composer.json.

@Krinkle's comment at T88436#1013289, which motivated the change, does suggest only testing deployment branches against vendor. But you seem to be saying that CI should never run composer install, even against master. If so, can you please spell out why you want to do this?

Also, can someone please remind me why we don't commit composer.lock? The approved June 2014 RFC said that composer.lock would be committed. But the March 2014 commit by the same author which introduced the first library dependency did not commit composer.lock, and it's never been committed since. I can't find any discussion of this detail. I'd expect it to be documented at mw:Composer but there's nothing about it there.

Whatever the reason is, does it also apply to deployment branches?

I think composer install should work in deployment branches, producing a usable set of packages. Otherwise, how are we meant to cherry-pick an update into the deployment branch of the vendor repo? The current procedure is to modify composer.json and re-run composer, right? If that gets you a random broken set of packages, how are you meant to respond to an urgent security issue in a dependency?

I think composer install should also work in major release branches, for the convenience of users.

These goals would be satisfied by either using semantic versioning, or by responding to a breaking change in a library by modifying composer.json in all active deployment branches and all release branches that are in the support period.

We ran into this *again* today. I accidentally tagged a v0.19.0 of parsoid while trying to make the v0.18.0 tag for MW 1.41. This immediately broke CI for everyone because of this job doing composer install against MW core's composer.json which says ^0.19.0-a5 and of course 0.19.0 is greater than -a5.

Developing a tightly coupled library alongside core with an ^ version constraint on master that we upgrade to a precise one when we release a deployment branch has worked fine for us for a number of years now, *except for this issue*.

I'm fine with having composer install work on major release branches, and having CI test against that. We change from upcarat dependencies to precise dependencies for the release branch.

I'm not sure what the issue you're raising with composer.lock is -- mediawiki-vendor does include a composer.lock. This bug is about CI ignoring the composer.lock (and the rest of the contents) of mediawiki-vendor in favor of doing composer install over the network.

Our weekly release process is that we test a specific git hash of parsoid, then tag it, then create a mediawiki-vendor patch which "releases" that tagged version into the mediawiki ecosystem (and onto the train). However, because of this issue, the tagged version is *immediately live in CI* as soon as we tag it, even before we make the mediawiki-vendor patch much less review and merge it. This also causes periodic issues when the CI run on mediawiki-vendor patch picks up a legit incompatibility -- but by the time it finds this we've broken CI for everyone because the tagged version is already live. Luckily our tagged releases aren't broken very often. (As it turns out, I already wrote about this above at T287419#9086980.)

I can see scenarios where composer install makes sense for master development .. i.e. when you want to use a new version of a package and use code available there which would fail if you tested it against vendor versions.

As for Parsoid, I think maybe we should not use "^" for parsoid tags in core's composer.json and instead use a specific pinned version and update it whenever we update vendor/ (after that has merged).

Change 972480 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@master] Pin Parsoid to a specific version

https://gerrit.wikimedia.org/r/972480

I pushed that patch above for discussion.

I much prefer the version where we use composer install only on release branches. More discussion in the https://gerrit.wikimedia.org/r/972480 patch.

Also, to @tstarling 's question: if CI would use the composer.lock from mediawiki-vendor at least, then this would also avoid the bug while also potentially allowing the installation of "additional packages" for testing which are not included in mediawiki-vendor. That was the reason I was originally given for CI including a composer install step -- that certain web-based tests required additional packages which they didn't want to have installed on the production machines, as everything in mediawiki-vendor is also installed on production. The other argument (about wanting to make sure that our mediawiki-vendor was synchronized with core's composer.json) is brand new to me -- it also seems like we could easily add a specific check for that specific issue, rather than rely on a composer install over the network to test it by proxy.

Change 972480 abandoned by Subramanya Sastry:

[mediawiki/core@master] Pin Parsoid to a specific version

Reason:

Scott proposed that we start the version pinning with the next deploy with the script in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/974291

Since we seem to have arrived at an agreement to go the precise pinning route with the discussion on this patch, the job of this patch is done. As such, I am going to abandon this patch

https://gerrit.wikimedia.org/r/972480

Since we have gone the way of pinning Parsoid version in core, this is no longer an issue and am closing this as declined.