
Wikibase CI, phan & mediawiki-vendor issues.
Open, High, Public

Description

This happened today, and thus CI for the master branch of Wikibase is broken, as can be seen in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/678227

Proposed solution
If the phan jobs were changed to do a composer install instead of using mediawiki-vendor, then we could flip the dependency chain:
Wikibase CI would pass (the patch would also need to include the fixes required to satisfy static analysis), and the mediawiki-vendor version bump could then depend on the Wikibase change.

Possible issues?

One potential issue is that the inverse dependency chain is what the docs suggest:

Note that you MUST pair patches changing versions of libraries used by MediaWiki itself with ones for the "core" repo. Specifically, the patch in mediawiki/core must have a Depends-On footer to the patch in mediawiki/vendor.
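For illustration, that pairing is expressed as a Gerrit commit-message footer in the core patch (the library name and change IDs below are placeholders, not real changes):

```
Update example/library from 1.2.0 to 2.0.0

Depends-On: I0000000000000000000000000000000000000000
Change-Id: I1111111111111111111111111111111111111111
```

CI then tests the core patch together with the mediawiki/vendor change it depends on.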

This is another case similar to T178137#6921107 where our CI will happily let something through to production with bugs in it because of the convoluted CI setup around mediawiki-vendor.
The fact that this was caught was again chance and luck, rather than the result of a process that avoids this kind of thing.
This also relates to T179663: the situation described above again involves the dependencies of Wikibase being ahead of those in mediawiki-vendor for some period of time without being automatically flagged, again leading to broken situations that will happily be deployed.
I stand by what I have said multiple times: the mediawiki-vendor workflows and the CI tied to development should be separated, to avoid all of this complexity and pain that has become a recurring theme.

Event Timeline

Addshore moved this task from Inbox to Investigate & Discuss on the wdwb-tech board.
Addshore added projects: phan, MediaWiki-Vendor.

I stand by what I have said multiple times: the mediawiki-vendor workflows and the CI tied to development should be separated, to avoid all of this complexity and pain that has become a recurring theme.

They are? The non-vendor CI jobs run in parallel and will tell you when e.g. you have unsatisfiable or clashing dependencies or whatever.

They are? The non-vendor CI jobs run in parallel and will tell you when e.g. you have unsatisfiable or clashing dependencies or whatever.

Seemingly not for phan for Wikibase or core?

So it sounds like we should just add another job for all places that currently run phan?
So that we have one that uses mediawiki-vendor and one that does not?

Revisiting this comment:

I stand by what I have said multiple times: the mediawiki-vendor workflows and the CI tied to development should be separated, to avoid all of this complexity and pain that has become a recurring theme.

They are? The non-vendor CI jobs run in parallel and will tell you when e.g. you have unsatisfiable or clashing dependencies or whatever.

So yes, this does mean that the mediawiki-vendor workflows for deployment etc. & the non-mediawiki-vendor workflows (which are what devs mainly care about) are bound very closely to the development workflows.
If that's how we are to continue, then yes, I think we need two jobs for every PHP CI job (bar perhaps codesniffer), one that uses mediawiki-vendor and one that does a composer install, in order to avoid things breaking (like in this case).
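A rough sketch of what that duplication could look like, as a dry-run outline that only prints the commands; the layout and commands are assumptions, not the actual Zuul/Quibble job definitions:

```shell
# Hypothetical sketch of splitting each PHP CI job into two dependency modes;
# the job layout and commands are illustrative, not the real CI config.
prepare_deps() {
  case "$1" in
    vendor)
      # Mode 1: pinned, deployed library versions from the mediawiki/vendor repo
      echo "git clone --depth 1 mediawiki/vendor vendor"
      ;;
    composer)
      # Mode 2: fresh resolution from the repository's own composer.json
      echo "composer install"
      ;;
  esac
}

# The same static-analysis step would then run against either dependency tree:
prepare_deps vendor
prepare_deps composer
```

Running both modes on every patch is what would have surfaced the breakage in this task before the mediawiki-vendor bump merged.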

As said in https://phabricator.wikimedia.org/T178137#6922193 the more decoupled solution would be to actually do this as part of some deployment process rather than in the development workflow.
This would be a step after development, but before things land in production.
Currently this would likely happen around the branch cut?

They are? The non-vendor CI jobs run in parallel and will tell you when e.g. you have unsatisfiable or clashing dependencies or whatever.

Seemingly not for phan for Wikibase or core?

Ah, yes, sorry, we don't support phan for anything other than master-with-vendor right now (your hacks to try to make it work for Wikibase notwithstanding).

The plan is to get phan fast enough that we can merge it into the regular composer test job in all repos, at which point it'll be covered everywhere equally. See T226945: Decide on future of running Phan tests on release branches for more details.

[…]

I was surprised to hear this because for core we have always been able to do it the opposite way. Patches to mediawiki-vendor that check in new or changed core dependencies have no Depends-On instruction, and their tests pass fine. On the mediawiki-core side, there is (naturally) always a Depends-On so that the core patch can pass its unit tests with the new code (both for new code it introduces, as well as to satisfy the Installer, which ensures vendor exists and matches composer.json requirements).

For example change 672487, and change 674386.

The mediawiki-vendor repo has significantly fewer tests, and that's basically the trade-off we made to make these patches easier to work with. The trade-off involves two important social expectations: 1) that you never ever merge stuff here unless it's been tested with Depends-On from a +2'ed patch in core or a gated extension repository, and 2) that the update is at least backwards-compatible for one commit while this stack lands. The first point is non-obvious and has at times led to a broken CI. The second one is more obvious in terms of discovery, since tests will fail if it isn't followed. For core updates this is usually unsurprising in my experience, since we have a stability policy; if a lib change were to be breaking in a way that doesn't merely remove already-deprecated features, we'd have no way to upgrade anyhow, since we can't atomically update callers in multiple production repos and upgrade a library all at the same time. You first prepare any callers to be forward-compatible, and then just do the upgrade smoothly as a (hopefully) unremarkable event.

So the current state would be:

  • Make mediawiki & extensions compatible with 2 versions of libraries
    • Tests that use mediawiki-vendor will run using the old versions of the libraries
    • Tests that do composer install will probably run using the newer version of the libraries
    • Phan right now will always run using the older versions of the libraries (which can make such changes harder)
    • I don't think there is anything in CI to actually make sure that the library changes in composer.json files are compatible with what is in mediawiki-vendor (see T179663)
  • Bump mediawiki-vendor versions, once everywhere can use the newer versions
    • I don't think there are any checks to see if what is in the mediawiki-vendor composer.json actually lines up with what the other packages ask for (see T179663)
    • At this point Phan (and possibly other tests) can start to fail (as happened before this ticket was created) leading to broken main branches
  • Fix the main branches / remove the back compat / remove the lower composer dependencies
    • Everything is okay again

I guess in this case the Depends-On patches would be in the extensions etc. on the bump to mediawiki-vendor?
Though it's hard to tell if they are needed, as CI would be green at this point?
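The "compatible with 2 versions of libraries" step above typically means widening the constraint in the extension's composer.json so that either the old (vendor-pinned) or the new library can resolve; a hypothetical fragment (the package name and versions are made up for illustration):

```json
{
    "require": {
        "some-vendor/some-library": "^2.0 || ^3.0"
    }
}
```

Once mediawiki-vendor has been bumped everywhere, the cleanup step narrows the constraint back down to the new version only.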

Ah, yes, sorry, we don't support phan for anything other than master-with-vendor right now (your hacks to try to make it work for Wikibase notwithstanding).

Sounds like it shouldn't be too hard to get phan running both using mediawiki-vendor and a composer install
Though I personally would still be in favour of this mediawiki-vendor repository being further away from the development workflow and closer to the deployment workflow.

The plan is to get phan fast enough that we can merge it into the regular composer test job in all repos, at which point it'll be covered everywhere equally. See T226945: Decide on future of running Phan tests on release branches for more details.

I guess (but didn't check) that this would also solve the problem, as jobs already exist doing composer test for both mediawiki-vendor and composer install

Whatever happened to us offloading some of our CI workloads onto other cloud providers?
That would make everything go much faster with very little effort

[2021-01-22 20:25:58] <addshore> James_F: mildly interesting, i ran a wmf-quibble-selenium-php72-docker job that had run on jenkins on github actions just to see what the test execution time was, 23 mins on jenkins, 16 mins on Github Actions https://usercontent.irccloud-cdn.com/file/0HH0fiSK/image.png

So the current state would be:

  • Make mediawiki & extensions compatible with 2 versions of libraries
    • Tests that use mediawiki-vendor will run using the old versions of the libraries
    • Tests that do composer install will probably run using the newer version of the libraries
    • Phan right now will always run using the older versions of the libraries (which can make such changes harder)
    • I don't think there is anything in CI to actually make sure that the library changes in composer.json files are compatible with what is in mediawiki-vendor (see T179663)
  • Bump mediawiki-vendor versions, once everywhere can use the newer versions
    • I don't think there are any checks to see if what is in the mediawiki-vendor composer.json actually lines up with what the other packages ask for (see T179663)
    • At this point Phan (and possibly other tests) can start to fail (as happened before this ticket was created) leading to broken main branches
  • Fix the main branches / remove the back compat / remove the lower composer dependencies
    • Everything is okay again

I guess in this case the Depends-On patches would be in the extensions etc. on the bump to mediawiki-vendor?
Though it's hard to tell if they are needed, as CI would be green at this point?

Well, no.

It's exceptionally rare, and not really supported, for multiple extensions to have the same explicit dependency on something in vendor.

You're making it far more complex than it should be, mostly because Wikibase is doing a bunch of things it's not meant to.

You're meant to use MW's extension dependency system to depend on the extension that provides the composer dependency (in your case, Wikibase).
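In extension registration terms, a downstream extension such as WikibaseLexeme would declare that in its extension.json (the version constraints shown here are only illustrative):

```json
{
    "requires": {
        "MediaWiki": ">= 1.36",
        "extensions": {
            "Wikibase": "*"
        }
    }
}
```

The composer-level library versions then stay the concern of Wikibase itself, not of the extensions depending on it.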

Supporting multiple versions of composer libraries is not an expected outcome, except for people trying to support multiple versions of MediaWiki in the master branch (which is very heavily discouraged and not actually supported by CI) or temporarily for transitional issues (like our support in core for two versions of dbal, because there's no version of dbal that supports both PHP 7.2 and 8.0).

You're meant to:

A) Write a patch in vendor that bumps the composer dependency.
B) Write a patch in the extension that bumps the composer dependency and makes things work, dependent on (A).
C) Merge them together.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/676077 should have depended on https://gerrit.wikimedia.org/r/c/mediawiki/vendor/+/676007 (not the other way around); that way, you would have been actually testing in CI the world that was going to happen once things merged. Of course, because WikibaseLexeme is not in the gate, at the end of the Wikibase patch landing WikibaseLexeme would have been broken, but merging https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseLexeme/+/676419 would have then resulted immediately in a functioning world for everyone. (Had WikibaseLexeme been in the gate, you'd have needed to force-merge the Wikibase patch, which would have taken down CI for ~ an hour for everyone else, so…).

I'd strongly recommend that you drop all of the Wikibase composer dependencies from WikibaseLexeme etc. They're all pulled in by Wikibase itself, and it's Wikibase that actually cares about those versions, as otherwise the code in Wikibase you're depending on will die.

The plan is to get phan fast enough that we can merge it into the regular composer test job in all repos, at which point it'll be covered everywhere equally. See T226945: Decide on future of running Phan tests on release branches for more details.

I guess (but didn't check) that this would also solve the problem, as jobs already exist doing composer test for both mediawiki-vendor and composer install

It would.

Whatever happened to us offloading some of our CI workloads onto other cloud providers?
That would make everything go much faster with very little effort

[2021-01-22 20:25:58] <addshore> James_F: mildly interesting, i ran a wmf-quibble-selenium-php72-docker job that had run on jenkins on github actions just to see what the test execution time was, 23 mins on jenkins, 16 mins on Github Actions https://usercontent.irccloud-cdn.com/file/0HH0fiSK/image.png

The team made the decision to replace the current 'legacy' CI system (with GitLab, as you may have heard). Most of the effort is going into that, AIUI.

The problem with the legacy CI is not infrastructure/speed, it's complexity and centralised control. Adding more centralised complexity will not make things better.

You're meant to use MW's extension dependency system to depend on the extension that provides the composer dependency (in your case, Wikibase).

Sounds like we need to update https://www.mediawiki.org/wiki/Manual:External_libraries
But it does sound like this should be a mostly workable solution that would remove some of this complexity (for now).