
Determine a standard way of installing MediaWiki lib/extension dependencies within containers
Open, Low, Public

Description

With the near-term goal of deploying minimal MediaWiki containers alongside services in the release pipeline for the purpose of e2e testing services, and the eventual goal of testing and deploying MediaWiki and its extensions via the pipeline to production, we'll need to somehow have Blubber install MediaWiki dependencies (both libraries and extensions) in some sane manner.

We could potentially build support directly into Blubber but this coupled approach seems wrong given Blubber's current design (stateless and preferring delegation to package managers over intimate knowledge of resident applications), and it also would seem like a missed opportunity to simplify the installation of MediaWiki extensions through something standard like composer.

Some ideas so far:

  1. Try to wrangle composer into natively installing extensions in addition to aggregate library dependencies (through the merge plugin).
  2. Write a composer plugin that understands the extension.json schema and can recursively resolve and install extensions. Note that the extension.json schema does not currently allow you to specify test/dev dependencies.
  3. Write some wrapper tooling that can recursively resolve and install MW extensions and leave library dependencies to composer.

In any case, Blubber would provide configuration to properly invoke the underlying dependency manager(s) much the same way it provides configuration for Node and Python dependency managers (npm and pip respectively).


Event Timeline

dduvall created this task. May 4 2018, 1:12 AM
Restricted Application added a subscriber: Aklapper. May 4 2018, 1:12 AM
dduvall triaged this task as Medium priority. May 4 2018, 1:12 AM
dduvall updated the task description. May 19 2018, 11:11 AM

I don't have a full picture of what the deployment container stuff is going to look like, so please correct me if my assumptions are wrong.

and it also would seem like a missed opportunity to simplify the installation of MediaWiki extensions through something standard like composer.

My main suggestion here would be not to do that. As someone who has spent a decent amount of time trying to simplify the installation of MediaWiki extensions (and probably just as much time trying to make composer work), I don't think composer fits the model of what we need out of a manager for MediaWiki extensions, whether it be security, privacy, naming, version management, branches, etc. It could probably be made to work, and people have, but that has also sacrificed a lot, notably breaking the installer. And just generally, dependency managers and package managers are hard problems. My current plan is to take the input from the current RfC, turn it into a plan that addresses most needs, and then build something that works, has TechCom approval, and can be endorsed as "official".

And, I don't think composer even gives you what you want here - since composer really just downloads extensions. You still need to manually come up with the dependency map inside of MediaWiki (assuming you want realistic e2e tests and aren't planning to enable every extension (that won't work for other reasons)).

The "standard" way of installing MediaWiki extensions is to get library dependencies from mediawiki/vendor, and then use Git to install the version of the extension you want. You'll have to manage extension dependencies manually, but given how long we've been doing that for, I don't think it's an extremely difficult burden. composer currently has no support for integrity verification, which is why we don't use composer-merge-plugin in production. Additionally, we've started to post process the composer output before deploying it (T194646).

Sidenote: We currently don't actually have true dependency data (for what you're trying to do I think). We have hard dependencies in extension.json, and we have test dependencies in CI, but those test dependencies also include dependencies that are just needed for static analysis. But I think what you want is practical dependencies, which is what extensions you actually need to have installed to make the extension useful (closest comparison is Recommends in the Debian world).

dduvall added a comment (edited). May 30 2018, 10:34 PM

Thanks, @Legoktm. We seem to be working at an intersection of different efforts and use cases—and a glaring absence of sufficient tooling—so there are bound to be conflicting requirements put forth as we continue work on the CD Pipeline. Your experience with composer, particularly that which relates to MediaWiki extension management, is greatly appreciated.

I don't have a full picture of what the deployment container stuff is going to look like, so please correct me if my assumptions are wrong.

Hopefully the tasks related to this one can provide more context for our requirements for extension/skin dependencies for the pipeline, particularly T193777: FY2017/18-Q4: Prove viability of testing staged service containers alongside MediaWiki extension containers.

and it also would seem like a missed opportunity to simplify the installation of MediaWiki extensions through something standard like composer.

My main suggestion here would be not to do that. As someone who has spent a decent amount of time trying to simplify the installation of MediaWiki extensions, (and probably spent just as much time trying to make composer work), I don't think composer fits the model of what we need out of a manager for MediaWiki extensions. Whether it be security, privacy, naming, version management, branches, etc. It could probably be made to work, and people have. But that's also sacrificed a lot, notably breaking the installer. And just generally, dependency managers and package managers are hard problems. My current plan is to take the input from the current RfC, turn it into a plan that addresses most needs, and then build something that works, has TechCom approval, and can be endorsed as "official".

A couple of us from Release Engineering attended the RfC hackathon session and made a suggestion for the kind of thing that might help push our work forward.

Is there a rough timeline for work from that RfC? Regardless of what we implement or adopt for the pipeline in the coming quarters, it's great to know there will be a comprehensive and official solution to extension management that we can eventually switch over to. That said, we'd rather not rely on unplanned work in setting our own goals for the pipeline.

It was probably a mistake to include the "simplify the installation of MediaWiki extensions" bit in the task's description, as it's not a chief goal of Release Engineering or a concern of the pipeline. It was stated as a purported side benefit, but that statement didn't incorporate the realities and history of work on MW extension management that you have rightly pointed out.

And, I don't think composer even gives you what you want here - since composer really just downloads extensions. You still need to manually come up with the dependency map inside of MediaWiki (assuming you want realistic e2e tests and aren't planning to enable every extension (that won't work for other reasons)).

Composer's ability to recursively resolve a complete dependency graph certainly does seem limited. The experiments I was running used its vcs repo support to resolve/install extensions from our git repos, but then I noticed that Composer can't recursively resolve dependencies from repos so that approach seems like a bust.

My next thought was that it might be possible to build in support for recursion into repos, but that brings me to a question for you: Was upstream responsive to feature requests or patches?

The "standard" way of installing MediaWiki extensions is to get library dependencies from mediawiki/vendor, and then use Git to install the version of the extension you want. You'll have to manage extension dependencies manually, but given how long we've been doing that for, I don't think it's an extremely difficult burden.

Right. It doesn't seem like a difficult burden when done manually. However, a manual process of any complexity is incompatible with the pipeline as we currently envision it.

To satisfy our requirement for automated integration testing of a change to a MediaWiki extension alongside its complementary or required services, we think we need a system that can take only a patchset to core or to an extension (e.g. VE) as input, produce a working container image (e.g. of core + VE + its lib and hard extension/skin dependencies), deploy that image to an isolated k8s namespace along with services (e.g. Parsoid, RESTBase), and perform limited but broad e2e testing. The current submodule and vendor-repo approach does not fit well with this design because:

  1. extension management via submodules is only a thing for core, and only for manually maintained release branches, not master
  2. the external and tightly coupled mapping in integration/config to manage test-level dependencies is a pattern we'd rather avoid as it prohibits the repo-authoritative model mentioned above
  3. the vendor repo is larger than necessary and works against keeping image sizes small, and it also prohibits the repo-authoritative model mentioned above

Sidenote: We currently don't actually have true dependency data (for what you're trying to do I think). We have hard dependencies in extension.json, and we have test dependencies in CI, but those test dependencies also include dependencies that are just needed for static analysis. But I think what you want is practical dependencies, which is what extensions you actually need to have installed to make the extension useful (closest comparison is Recommends in the Debian world).

During our previous offsite, Release Engineering also (briefly) looked at the extension.json schema (mentioned in the task description) and we noticed that same lack of test-/dev-level dependencies and suggests/recommends functionality. Do you see that schema being extended at any point? Do you have any thoughts/advice on ideas 2 and 3 mentioned in the task, either building something integrated with composer that understands extension.json or a completely separate tool?

Vvjjkkii renamed this task from Determine a standard way of installing MediaWiki lib/extension dependencies within containers to bndaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
thcipriani renamed this task from bndaaaaaaa to Determine a standard way of installing MediaWiki lib/extension dependencies within containers. Jul 1 2018, 6:41 PM
thcipriani updated the task description.
CommunityTechBot lowered the priority of this task from High to Medium. Jul 5 2018, 6:36 PM
thcipriani lowered the priority of this task from Medium to Low. Jul 16 2018, 7:27 PM
thcipriani moved this task from Backlog to Migration on the Release Pipeline board.

We have filed this task following the Release-Engineering-Team May 2018 offsite. One of the needs is for CI to be able to *optionally* install dependent extensions. The current system is centrally managed in integration/config.git and suffers from several issues:

  • requires review/merge/deploy from one of the CI maintainers
  • is not aware of branches

That is expressed on MediaWiki.org Extension_management_2018_feedback#Programatically_install_extension_dependencies:

Currently Wikimedia CI relies on a mapping inside the integration/config repository to determine which extensions depend on other extensions within the context of CI; however, outside of our current testing setup this is not usable. Allowing extensions to define those other extensions on which they depend would be generally useful as well as useful for future development of an extension-testing pipeline. TCipriani (WMF) (talk) 14:27, 19 May 2018 (UTC)

Quoting the 3 points Dan wrote originally:

  1. Try to wrangle composer into natively installing extensions in addition to aggregate library dependencies (through the merge plugin).
    • @Legoktm explained above how we can't/don't want to use composer for that.
  2. Write a composer plugin that understands the extension.json schema and can recursively resolve and install extensions. Note that the extension.json schema does not currently allow you to specify test/dev dependencies.
  3. Write some wrapper tooling that can recursively resolve and install MW extensions and leave library dependencies to composer.

The two other options are about introducing a new wrapper, either as a composer plugin (2) or some CLI utility (3). I think we can just ship a maintenance utility in mediawiki/core that would take care of that, and eventually make it a composer plugin later on if need be.


With the extension registry, developers are able to express a dependency upon another extension (see the MediaWiki manual, Manual:Extension_registration); an example would be:

extension.json
{
	"requires": {
		"extensions": {
			"FakeExtension": "*"
		}
	}
}

So, currently, CI centrally defines:

zuul/parameter_functions.py
dependencies = {
    '3D': ['MultimediaViewer'],
    'MultimediaViewer': ['BetaFeatures'],
}

When one sends a patch to mediawiki/extensions/3D, CI builds the list of dependencies and injects into the build something like:

EXT_DEPENDENCIES=mediawiki/extensions/MultimediaViewer\\nmediawiki/extensions/BetaFeatures
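For illustration, the expansion of that central mapping into the EXT_DEPENDENCIES value can be sketched as follows (this is not the actual CI code, just a minimal Python sketch of the resolution it performs):

```python
# Minimal sketch of how the centrally managed mapping in
# zuul/parameter_functions.py expands into EXT_DEPENDENCIES.
dependencies = {
    '3D': ['MultimediaViewer'],
    'MultimediaViewer': ['BetaFeatures'],
}

def resolve(ext, mapping):
    """Return the transitive dependency closure of one extension."""
    seen = []
    stack = [ext]
    while stack:
        current = stack.pop()
        for dep in mapping.get(current, []):
            if dep not in seen:
                seen.append(dep)
                stack.append(dep)
    return seen

def ext_dependencies(ext, mapping):
    """Render the closure as the backslash-n separated repo list."""
    repos = ['mediawiki/extensions/%s' % d for d in resolve(ext, mapping)]
    return '\\n'.join(repos)

print(ext_dependencies('3D', dependencies))
# mediawiki/extensions/MultimediaViewer\nmediawiki/extensions/BetaFeatures
```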

Instead, we should just clone the 3D extension, then run the new CLI utility, which would look at the requirements defined in extension.json:

extension.json
{
	"requires": {
		"extensions": {
			"MultimediaViewer": "*"
		}
	}
}

Clone the required extensions/skins and process recursively. E.g. MultimediaViewer would be cloned and would need to have:

extension.json
{
	"requires": {
		"extensions": {
			"BetaFeatures": "*"
		}
	}
}
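The recursive clone-and-resolve loop described above might be sketched roughly like this (the function names are hypothetical, and a real implementation would delegate cloning to zuul-cloner rather than a plain git clone):

```python
import json
import os
import subprocess

def read_requires(workdir, name):
    """Read the 'requires.extensions' map from a checked-out extension."""
    path = os.path.join(workdir, name, 'extension.json')
    with open(path) as f:
        manifest = json.load(f)
    return manifest.get('requires', {}).get('extensions', {})

def clone(workdir, name):
    """Placeholder for zuul-cloner; a plain git clone is shown only
    for illustration."""
    subprocess.check_call([
        'git', 'clone',
        'https://gerrit.wikimedia.org/r/mediawiki/extensions/%s' % name,
        os.path.join(workdir, name),
    ])

def process_requirements(workdir, names, clone_fn=clone, read_fn=read_requires):
    """Clone each extension, then recursively clone what it requires."""
    done = set()
    queue = list(names)
    while queue:
        name = queue.pop(0)
        if name in done:
            continue
        clone_fn(workdir, name)
        done.add(name)
        queue.extend(read_fn(workdir, name).keys())
    return done
```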

The devil in the details is that on CI the cloning will have to be done using zuul-cloner in order to check out the proper branch/patch. And we would probably need to have Quibble handle it for us, since that is the main test runner nowadays.

Hence, my proposal is to implement the feature directly in Quibble behind a feature flag. In this mode, it will ignore the legacy EXT_DEPENDENCIES environment variable (or even die if it is present), then clone the repositories passed as arguments and recursively process each of them:

quibble --process-extension-requirements mediawiki/extensions/3D
INFO: cloning mediawiki/extensions/3D
INFO: checking out patch/branch from Zuul
INFO: 3D depends on MultimediaViewer

INFO: cloning mediawiki/extensions/MultimediaViewer
INFO: checking out patch/branch from Zuul
INFO: MultimediaViewer depends on BetaFeatures

INFO: cloning mediawiki/extensions/BetaFeatures
INFO: checking out patch/branch from Zuul

... run composer merge plugin ...
... install mediawiki ...

And this way we can phase out the dependencies from zuul/parameter_functions.py. Thoughts?

Change 502286 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/quibble@master] Clone requirements from ext dependencies

https://gerrit.wikimedia.org/r/502286

In general I think that's a good idea. Two things I think we'll need to do:

  1. Add support for require-dev to extension.json, since CI needs to know the real dependencies and the test dependencies.
  2. Provide some script to abstract the reading of extension.json since the format isn't guaranteed long-term.

For #2, here's what I have locally right now:

km@km-pt:/srv/mediawiki/core$ php maintenance/checkDependencies.php --extensions=CodeEditor,Scribunto,ApiFeatureUsage,MassMessage --skins=Timeless --json | json_pp
{
   "skins" : {
      "Timeless" : {
         "status" : "loaded"
      }
   },
   "extensions" : {
      "Elastica" : {
         "status" : "missing",
         "why" : [
            "ApiFeatureUsage"
         ]
      },
      "ApiFeatureUsage" : {
         "why" : [
            "ApiFeatureUsage"
         ],
         "status" : "present"
      },
      "WikiEditor" : {
         "why" : [
            "CodeEditor"
         ],
         "status" : "present"
      },
      "CodeEditor" : {
         "why" : [
            "CodeEditor"
         ],
         "status" : "present"
      },
      "Scribunto" : {
         "status" : "loaded"
      },
      "MassMessage" : {
         "why" : [
            "MassMessage"
         ],
         "message" : "MassMessage is not compatible with the current MediaWiki core (version 1.34.0-alpha), it requires: <= 1.31.0.",
         "status" : "incompatible-core"
      }
   }
}

So the workflow becomes:

  1. Install MediaWiki
  2. Clone $ZUUL_PROJECT
  3. Run php maintenance/checkDependencies.php --extensions=Whatever
  4. If there are any "missing" fields, clone those. Go back to step 3. If there are any "incompatible" fields, give up.
  5. Add wfLoadExtension() to LocalSettings.php
  6. Run whatever tests
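The loop in steps 3 and 4 could be sketched as follows (a rough Python sketch with hypothetical helper names; it parses the same JSON shape shown in the checkDependencies.php output above):

```python
import json
import subprocess

def check(extensions):
    """Run checkDependencies.php and parse its JSON output."""
    out = subprocess.check_output([
        'php', 'maintenance/checkDependencies.php',
        '--extensions=%s' % ','.join(extensions), '--json',
    ])
    return json.loads(out)

def resolve_missing(extensions, check_fn=check, clone_fn=None):
    """Repeat the check, cloning anything 'missing', until settled.
    Give up immediately on any 'incompatible-*' status."""
    wanted = list(extensions)
    while True:
        statuses = check_fn(wanted).get('extensions', {})
        incompatible = [n for n, info in statuses.items()
                        if info['status'].startswith('incompatible')]
        if incompatible:
            raise RuntimeError('incompatible: %s' % ', '.join(incompatible))
        missing = [n for n, info in statuses.items()
                   if info['status'] == 'missing']
        if not missing:
            return wanted
        for name in missing:
            clone_fn(name)
            wanted.append(name)
```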

Change 503735 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@master] [WIP] Add checkDependencies.php

https://gerrit.wikimedia.org/r/503735

Change 511133 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@master] [WIP] registration: Add development dependencies to extension.json

https://gerrit.wikimedia.org/r/511133

awight added a subscriber: awight. May 21 2019, 7:51 AM

Change 511133 merged by jenkins-bot:
[mediawiki/core@master] registration: Add development requirements to extension.json

https://gerrit.wikimedia.org/r/511133

Change 503735 merged by jenkins-bot:
[mediawiki/core@master] Add checkDependencies.php

https://gerrit.wikimedia.org/r/503735

We have filed this task following the Release-Engineering-Team May 2018 offsite. One of the needs is for CI to be able to *optionally* install dependent extensions. The current system is centrally managed in integration/config.git and suffers from several issues:

  • requires review/merge/deploy from one of the CI maintainers
  • is not aware of branches

My motivation for filing this task was to solve MediaWiki dependency resolution for the purpose of building MediaWiki images via the deployment pipeline. Improving dependency resolution for our current CI jobs is a laudable goal as well, but slightly orthogonal.

Quoting the 3 points Dan wrote originally:

  1. Try to wrangle composer into natively installing extensions in addition to aggregate library dependencies (through the merge plugin).
    • @Legoktm explained above how we can't/don't want to use composer for that.
  2. Write a composer plugin that understands the extension.json schema and can recursively resolve and install extensions. Note that the extension.json schema does not currently allow you to specify test/dev dependencies.
  3. Write some wrapper tooling that can recursively resolve and install MW extensions and leave library dependencies to composer.

The devil is that on CI the cloning will have to be done using zuul-cloner in order to checkout the proper branch/patch. And we would probably need to have Quibble to handle it for us since that is the main test runner nowadays.

@hashar, can you elaborate on why Quibble is the right tool for this? In my mind, dependency resolution and installation is a well-scoped problem and a test runner doesn't seem like the right place for it. Again, I'm thinking in terms of the Deployment Pipeline and image building, not running tests via our current CI jobs.

In general I think that's a good idea. Two things I think we'll need to do:

  1. Add support for require-dev to extension.json, since CI needs to know the real dependencies and the test dependencies.

Thanks so much, @Legoktm, for introducing this into the extension.json schema. I think whichever way we go with tooling, this is such a central piece to the overall puzzle.

  2. Provide some script to abstract the reading of extension.json since the format isn't guaranteed long-term.

Can you say more about that? Is it the entire specification that's not guaranteed, or specific fields that aren't guaranteed version to version? Do you see the "requires" fields going away in future versions of the schema?

So the workflow becomes:

  1. Install MediaWiki
  2. Clone $ZUUL_PROJECT
  3. Run php maintenance/checkDependencies.php --extensions=Whatever
  4. If there are any "missing" fields, clone those. Go back to step 3. If there are any "incompatible" fields, give up.
  5. Add wfLoadExtension() to LocalSettings.php
  6. Run whatever tests

This workflow is a huge improvement over what we currently do for existing CI jobs. I have some concerns about it for image building for a couple of reasons.

  1. It's not a full solver. Determining whether the version constraints for the current working copies of MediaWiki/extensions/skins are satisfied is a really useful sanity check for whether to proceed with further testing or not. It doesn't facilitate getting from an unresolved state to a resolved state of specific dependencies, which is crucial for deterministic image building.
  2. The "clone those" part of step 4 isn't clear to me. Do we assume that we can always clone the master branch (or a release branch if that's what the current patch is destined for) of each missing extension/skin? I suppose this is what we currently do in CI, but it's likely to be prone to higher rates of failure with the newly introduced check.
  3. The repeat and give up parts concern me. What is the recourse after such failure? Will it be clear to the developer how to resolve incompatible versions? If the tool doesn't fully solve for optimal dependencies, how can it do any better during the next run but perform the same checks and likely fail? Will changing version specifications to fix such failure cause unexpected failure elsewhere?

As we move closer to giving control to individual teams and developers over how their code will be built, packaged, and deployed via the Deployment Pipeline, I think it will be really important for the supporting toolchain to produce deterministic results and produce actionable messages in the case of failure. In my mind, a complete solver is needed.

Now that we have the ability to fully express runtime dependencies as well as development dependencies for skins and extensions—which again, I think was the most important piece of the puzzle—we'll need something that can take the version specs under "requires" for a given extension and map them to a set of installable artifacts (e.g. git remote and ref) that would best satisfy the specs. I have some really rough thoughts on that, which probably demand a lot of scrutiny. Again, my ideas are informed mostly by the Deployment Pipeline work and not meant to be general purpose (for third parties, etc.). Regardless, please tear it apart. :)

  1. Like @Legoktm mentioned already, now that we have "requires-dev" in extension.json schema, we can populate that for each extension/skin based on what's currently in integration/config/zuul/parameter_functions.py.
  2. Determine a service (or develop one) that can be seeded with extension.json "requires" and "requires-dev" version specifications and aggregate them into an efficient (graph) store for later querying. What to store and how to store it is not entirely clear; this is a hard problem, but there are prior works to draw from. (Fortunately, this also seems like a really fun problem.)
  3. Continue to seed the service each time an extension/skin/core patch is merged, keeping version-to-git-ref mappings and version (in)compatibility sets up to date. It's very likely something like this would have to be periodically rebuilt, which is expensive but important. The extension.json files should be the authoritative source of version compatibility information.
  4. Implement a service endpoint that can fully solve a dependency graph from a given set of version constraints, returning a mapping of extension/skin name to clone-able git remote/ref that best satisfies the constraints.
  5. Develop CLI tooling that can query said service, submitting the version constraints from a local extension.json file, getting back a mapping of extension/skin name to git remote/ref, and delegate to concurrent git clone commands that perform installation.
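A sketch of what the CLI in point 5 might look like. Everything here (the solver URL, the request and response shapes, the function names) is hypothetical, since the service described in points 2 through 4 does not exist yet:

```python
import json
import subprocess
import urllib.request

SOLVER_URL = 'https://example.invalid/api/solve'  # hypothetical endpoint

def solve(constraints, fetch=None):
    """Submit version constraints to the (hypothetical) solver service.
    Returns a mapping of extension name -> {'remote': ..., 'ref': ...}."""
    if fetch is None:
        def fetch(payload):
            req = urllib.request.Request(
                SOLVER_URL,
                data=json.dumps(payload).encode(),
                headers={'Content-Type': 'application/json'})
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)
    return fetch({'requires': constraints})

def install(solution, workdir, run=subprocess.check_call):
    """Clone each solved extension at the returned remote/ref."""
    for name, target in solution.items():
        run(['git', 'clone', '--branch', target['ref'],
             target['remote'], '%s/%s' % (workdir, name)])
```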
hashar added a comment. Jun 5 2019, 6:59 AM

My motivation for filing this task was to solve MediaWiki dependency resolution for the purpose of building MediaWiki images via the deployment pipeline. Improving dependency resolution for our current CI jobs is a laudable goal as well, but slightly orthogonal.

Building MediaWiki images and preparing the source code for CI both require dependency resolution of extensions/skins. Additionally, the deployment pipeline will replace the current CI jobs / Quibble.
The ultimate outputs are different (a Docker image versus a flat tree), but it is the same problem really: fetch extensions and their dependencies recursively.

Quoting the 3 points Dan wrote originally:

  1. Try to wrangle composer into natively installing extensions in addition to aggregate library dependencies (through the merge plugin).
    • @Legoktm explained above how we can't/don't want to use composer for that.
  2. Write a composer plugin that understands the extension.json schema and can recursively resolve and install extensions. Note that the extension.json schema does not currently allow you to specify test/dev dependencies.
  3. Write some wrapper tooling that can recursively resolve and install MW extensions and leave library dependencies to composer.

The devil is that on CI the cloning will have to be done using zuul-cloner in order to checkout the proper branch/patch. And we would probably need to have Quibble to handle it for us since that is the main test runner nowadays.

@hashar, can you elaborate on why Quibble is the right tool for this? In my mind, dependency resolution and installation is a well-scoped problem and a test runner doesn't seem like the right place for it. Again, I'm thinking in terms of the Deployment Pipeline and image building, not running tests via our current CI jobs.

Quibble does more than just run tests; it also has all the logic to prepare the source tree to be tested, be it cloning repositories or installing dependencies from composer/npm. Surely its headline is a bit misleading :]

Again, the dependency resolution right now is entirely defined in zuul/parameter_functions.py, with Quibble cloning the given list of repositories and then checking out the proper patches from Zuul. It felt natural to me to migrate that logic to Quibble, which in my mind was a quick win to address the few issues I mentioned earlier (requires CI folks to +2 a change, same set of deps regardless of branch, hard to reproduce locally, etc.).

Surely Quibble can just be made to invoke external tooling, be it a composer plugin or a wrapper provided by MediaWiki. I did write a proof of concept for Quibble which processes requirements in extension.json, clones/fetches them, and repeats until fulfilled. Code can be seen at https://gerrit.wikimedia.org/r/#/c/integration/quibble/+/502286/ , but then I was very boldly asked to stop working on that.

In the end, Quibble will be phased out anyway and entirely replaced by the local development tooling / Deployment Pipeline.

(Uh, I forgot to press submit on my comment, sorry.)

  2. Provide some script to abstract the reading of extension.json since the format isn't guaranteed long-term.

Can you say more about that? Is it the entire specification that's not guaranteed, or specific fields that aren't guaranteed version to version? Do you see the "requires" fields going away in future versions of the schema?

Currently, the only guarantee that we make is that an extension with an extension.json file will just continue to work. Breaking changes are done via bumping manifest_version (currently only done once, with no plans for a v3 right now).

One thing I've tried to tell people is that while any tool should feel free to consume this data, the only authoritative source and implementation of the format is MediaWiki. Most other things (homegrown scripts, Grunt config) generally take shortcuts, which is fine, but not officially supported if for whatever reason we radically changed formats. That's mostly a holdover from when we/I wasn't that confident in the format stability.

All that said, I feel relatively comfortable now declaring the format stable enough, provided they implement some specific checks (e.g. verifying manifest_version). I'd want to write some more detailed documentation about that first though.
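A minimal sketch of the kind of check described here, assuming a consumer written in Python (the set of supported versions and the function names are illustrative):

```python
import json

# manifest_version has only ever been bumped once, so 1 and 2 are the
# versions a consumer today could claim to understand.
SUPPORTED_MANIFEST_VERSIONS = (1, 2)

def check_manifest_version(manifest):
    """Refuse manifests whose format version we don't understand,
    rather than silently misreading a future format."""
    version = manifest.get('manifest_version')
    if version not in SUPPORTED_MANIFEST_VERSIONS:
        raise ValueError('unsupported manifest_version: %r' % (version,))
    return manifest

def load_extension_manifest(path):
    """Read and validate an extension.json file."""
    with open(path) as f:
        return check_manifest_version(json.load(f))
```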

This workflow is a huge improvement over what we currently do for existing CI jobs. I have some concerns about it for image building for a couple of reasons.

  1. It's not a full solver. Determining whether the version constraints for the current working copies of MediaWiki/extensions/skins are satisfied is a really useful sanity check for whether to proceed with further testing or not. It doesn't facilitate getting from an unresolved state to a resolved state of specific dependencies, which is crucial for deterministic image building.
  2. The "clone those" part of step 4 isn't clear to me. Do we assume that we can always clone the master branch (or a release branch if that's what the current patch is destined for) of each missing extension/skin? I suppose this is what we currently do in CI, but it's likely to be prone to higher rates of failure with the newly introduced check.
  3. The repeat and give up parts concern me. What is the recourse after such failure? Will it be clear to the developer how to resolve incompatible versions? If the tool doesn't fully solve for optimal dependencies, how can it do any better during the next run but perform the same checks and likely fail? Will changing version specifications to fix such failure cause unexpected failure elsewhere?

As we move closer to giving control to individual teams and developers over how their code will be built, packaged, and deployed via the Deployment Pipeline, I think it will be really important for the supporting toolchain to produce deterministic results and produce actionable messages in the case of failure. In my mind, a complete solver is needed.

I don't think we want or need a full solver really. First, we don't actually do extension "releases", we just deploy master (effectively, yes there's a technical difference with wmf/ branches). If we theoretically did have releases, I don't think we want extension A saying it's only compatible with extension B "~1.0.0" holding back some bug fix in B's "1.1.1" release by pulling in an older release of B. Instead, we'd rather just fail, and update A to be compatible with the new B.

It's also possible I'm not grasping how things are expected to be in the Deployment Pipeline? Are we expecting to start tagging extension releases? So far we've just been using recursion in ExtensionRegistry and CI to mostly successful outcomes.

Re #3, I think the error messages are good enough for now, they say X isn't compatible with Y constraint or something like that. It's still up to the developer to figure out whether X or Y is what should be fixed.

Now that we have the ability to fully express runtime dependencies as well as development dependencies for skins and extensions—which again, I think was the most important piece of the puzzle—we'll need something that can take the version specs under "requires" for a given extension and map them to a set of installable artifacts (e.g. git remote and ref) that would best satisfy the specs. I have some really rough thoughts on that, which probably demand a lot of scrutiny. Again, my ideas are informed mostly by the Deployment Pipeline work and not meant to be general purpose (for third parties, etc.). Regardless, please tear it apart. :)

  1. Like @Legoktm mentioned already, now that we have "requires-dev" in the extension.json schema, we can populate it for each extension/skin based on what's currently in integration/config/zuul/parameter_functions.py.
  2. Determine a service (or develop one) that can be seeded with extension.json "requires" and "requires-dev" version specifications and aggregate them into an efficient (graph) store for later querying. What to store and how to store it is not entirely clear; this is a hard problem, but there is prior work to draw from. (Fortunately, this also seems like a really fun problem.)
  3. Continue to seed the service each time an extension/skin/core patch is merged, keeping version-to-git-ref mappings and version (in)compatibility sets up to date. It's very likely something like this would have to be periodically rebuilt, which is expensive but important. The extension.json files should be the authoritative source of version compatibility information.
  4. Implement a service endpoint that can fully solve a dependency graph from a given set of version constraints, returning a mapping of extension/skin name to clone-able git remote/ref that best satisfies the constraints.
  5. Develop CLI tooling that can query said service, submitting the version constraints from a local extension.json file, getting back a mapping of extension/skin name to git remote/ref, and delegate to concurrent git clone commands that perform installation.
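To make step 5 concrete, here is a minimal sketch of the final stage of such a CLI tool: taking a resolver response and turning it into git clone invocations. The response shape (a mapping of extension name to remote/ref) and the helper name are my assumptions — no such service exists yet — and the commands are only built, not executed, so a real tool would hand them to subprocess or similar.

```python
def build_clone_commands(resolution, dest="extensions"):
    """Given a hypothetical resolver response mapping extension name to a
    clone-able git remote and ref, build the clone commands that would
    install each extension under `dest`. (Shallow clones keep images small.)"""
    commands = []
    for name, spec in sorted(resolution.items()):
        commands.append([
            "git", "clone", "--depth", "1",
            "--branch", spec["ref"],
            spec["remote"],
            "%s/%s" % (dest, name),
        ])
    return commands

# Hypothetical response from the solver endpoint in step 4:
resolution = {
    "Math": {
        "remote": "https://gerrit.wikimedia.org/r/mediawiki/extensions/Math",
        "ref": "master",
    },
}
cmds = build_clone_commands(resolution)
```

The clones are independent of one another, so a real implementation could run them concurrently as described above.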

https://www.mediawiki.org/wiki/Extension:ExtensionDistributor/tardist was my initial draft at such a service (it predates dependencies so it's pretty out of date). I think the decision on whether we need a full solver will shape what this service ultimately looks like.

daniel added a subscriber: daniel.Jun 14 2019, 11:46 AM

I'd like to better understand the need, and the requirements. In particular: do we really need or even want automatic resolution of extension dependencies?

I don't know much about Blubber, but in my mind, I'd imagine something like this:

  • the Blubber file has a list of extensions to install (perhaps including a repo URI for each)
  • clone MediaWiki
  • clone each extension
  • read extension.json to verify dependencies/requirements are met.
  • generate a composer.local.json from the list of extensions (and skins)
  • run composer install

The key question is where the script that reads and checks extension.json should live. Blubber seems the wrong place, but we can't run mediawiki at this stage either. So perhaps it could be in a library that gets pulled in by composer, and exposes the script in vendor/bin. This would require composer to run first, but that wouldn't be so terrible, I think.
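The composer.local.json generation step in the list above is mechanical. Below is a sketch in Python (the language is my choice; any small script would do) that emits the structure MediaWiki's bundled composer-merge-plugin expects, mirroring the layout of MediaWiki's composer.local.json-sample:

```python
import json

def composer_local_json(extensions, skins=()):
    """Build composer.local.json content so that composer-merge-plugin
    picks up each installed extension's and skin's own composer.json."""
    include = ["extensions/%s/composer.json" % e for e in extensions]
    include += ["skins/%s/composer.json" % s for s in skins]
    return json.dumps(
        {"extra": {"merge-plugin": {"include": include}}},
        indent="\t",
    )

doc = composer_local_json(["Math", "Wikibase"], skins=["Vector"])
```

Running `composer install` in the MediaWiki root afterwards would then pull in the merged library dependencies.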

Am I completely on the wrong track?

@daniel that is what I have implemented, which in all honesty is just a few lines of code. A potential devil when using git is fetching the proper branch, but besides that it is pretty much straightforward.

The composer library could be installed standalone and outside of the MediaWiki code base. Once the tool has fulfilled its mission (cloning the repositories), it can be garbage-collected out of the image and the MediaWiki installation attempted (composer.local.json, install.php with extension detection).
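The "proper branch" devil mentioned above usually comes down to one decision: fetch the branch matching the change under test if the dependency has it, otherwise fall back to the default branch. A minimal sketch of that choice (the function name and fallback are assumptions, not any tool's actual API):

```python
def pick_ref(wanted, available, fallback="master"):
    """Choose which git ref to fetch for a dependency: prefer the branch
    matching the change under test (e.g. a wmf/ or REL1_* branch); fall
    back to the default branch when the dependency doesn't have it."""
    return wanted if wanted in available else fallback
```

A real tool would populate `available` from something like `git ls-remote --heads` before cloning.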

@daniel that is what I have implemented, which in all honesty is just a few lines of code. A potential devil when using git is fetching the proper branch, but besides that it is pretty much straightforward.

So, what remains to be done here? Does your solution address the problem as stated in the task description?

I caused confusion since I thought this task was rather widely scoped, but it is primarily meant to determine how to install extensions and their dependencies for the future MediaWiki containers. So for CI I am taking another approach (using Quibble for the instrumentation) and we will adjust later when we start phasing out the current CI jobs / Quibble.

For the MediaWiki container that will run on Kubernetes, I have no idea how the code will be shipped; there are too many unknowns at this time. Most probably it would come with a hardcoded list of extensions and then verify all dependencies are fulfilled using Kunal's script maintenance/checkDependencies.php ( https://gerrit.wikimedia.org/r/503735 ).
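The core of such a verification is small: compare each extension's "requires" section against what is actually present in the image. A rough sketch of that check in Python (checkDependencies.php itself is PHP and also validates version constraints, which this sketch deliberately skips):

```python
def unmet_requirements(extension_json, installed):
    """Given one parsed extension.json and the set of extension/skin names
    present in the image, list required components that are missing.
    Version constraints under "requires" are ignored in this sketch."""
    requires = extension_json.get("requires", {})
    missing = []
    for kind in ("extensions", "skins"):
        for name in requires.get(kind, {}):
            if name not in installed:
                missing.append(name)
    return sorted(missing)
```

With a hardcoded production extension list, a non-empty result for any extension would fail the image build.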

In short: for CI I am covered by other means, for the deployment pipeline / kubernetes based container: the future still needs to be determined.

Change 502286 merged by jenkins-bot:
[integration/quibble@master] Clone requirements from ext dependencies

https://gerrit.wikimedia.org/r/502286

WDoranWMF removed Legoktm as the assignee of this task.Jul 5 2019, 5:21 PM

I'd like to better understand the need, and the requirements. In particular: do we really need or even want automatic resolution of extension dependencies?
I don't know much about Blubber, but in my mind, I'd imagine something like this:

  • the blubber files has a list of extensions to install (perhaps including a repo URI for each)

<snip>

I've been operating under the assumption that at some stage we're going to have some kind of containers with just some extensions installed, and we'll need to resolve those dependencies for testing or whatever.

In production, presumably we'd give it the full list of extensions (like the make-wmf-branch config file) and then just assert that everything resolves properly.

daniel added a comment.EditedJul 9 2019, 11:28 AM

I've been operating under the assumption that at some stage we're going to have some kind of containers with just some extensions installed, and we'll need to resolve those dependencies for testing or whatever.

I'm not sure that this would actually be very useful. As far as I understand, the goal here is to test the interoperability of extensions. Just enabling all of them and running all their tests in the combined environment doesn't seem like a great approach for that.

In my mind, we'd want to define scenarios with a specific setup of a specific set of extensions, and then write tests specifically for that setup. This would also allow us to test combinations of extensions that don't explicitly know or care about each other, and to test different configurations.

Otherwise we'll end up in a situation where we have to run all tests for all extensions on any change to any extension. That seems wasteful and may still miss issues that depend on configuration, multi-site setups, etc.

Pinging @Physikerwelt for the Mathoid use case.

For example, the Math extension has no hard dependencies but has optional dependencies on Wikibase, VisualEditor, and RESTBase. While we already check whether RESTBase is available, checks for enabled extensions are not yet implemented. See

https://github.com/wikimedia/mediawiki-extensions-Math/blob/183c4fcc6f7d90ed08ba13979e76e33f96e50ae5/tests/phpunit/MathMLRdfBuilderTest.php#L31

In an ideal world with unlimited developer resources, I would wish that the information on optional dependencies could be modeled in the testing framework, and that tests would run in two groups: 1) tests that only require the current extension, and 2) tests that verify the functionality together with other extensions.

... might be off-topic here, but I maintain my setup (dev and private wikis) via a very simple Docker Compose file: https://github.com/physikerwelt/mediawiki-docker

I've been operating under the assumption that at some stage we're going to have some kind of containers with just some extensions installed, and we'll need to resolve those dependencies for testing or whatever.

I'm not sure that this would actually be very useful. AS far as I understand, goal here is to test the interoperability of extensions. Just enabling all of them and running all their tests in the combined environment doesn't seem like a great approach for that.

The idea would be to have a container with a subset of extensions installed, against which we could run both unit and integration tests. Defining specific integration tests against specific dependent extensions would be an exercise for extension developers, once they are able to define and resolve specific extension dependencies (as declared in their extension.json).
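Determining which extensions belong in such a subset container is a transitive-closure walk over each extension.json's "requires.extensions" entries. A sketch, where the manifests and extension names are hypothetical stand-ins for parsed extension.json files:

```python
def transitive_extensions(roots, manifests):
    """Compute the full set of extensions to install in a container,
    following "requires.extensions" in each extension.json recursively.
    `manifests` maps extension name to its parsed extension.json dict."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        requires = manifests.get(name, {}).get("requires", {})
        stack.extend(requires.get("extensions", {}))
    return seen

# Hypothetical dependency chain A -> B -> C:
manifests = {
    "A": {"requires": {"extensions": {"B": "*"}}},
    "B": {"requires": {"extensions": {"C": "*"}}},
    "C": {},
}
resolved = transitive_extensions(["A"], manifests)
```

This is essentially what ExtensionRegistry's recursion does at runtime, done ahead of time at image-build time.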

In my mind, we'd want to define scenarios with a specific setup of a specific set of extensions, and then write tests specifically for that setup. This would also allow us to test combinations of extensions that don't explicitly know or care about each other, and to test different configurations.
Otherwise we'll end up in a situation where we have to run all test for all extensions on any change to any extension. That seems wasteful and may still miss issues that depend on configuration, multi-site setups, etc.

The idea of using Blubber and the entire Deployment Pipeline for extensions and for MediaWiki is to provide rapid, staged feedback. For example: as a first commit stage, running unit tests for a particular extension; then, after that stage has passed or failed, running integration tests that exercise dependent extensions.

Testing a set of extensions destined to end up in production together, possibly unaware of one another, belongs to a different stage of integration -- a staging environment that runs a series of smoke tests, for instance, could satisfy that -- and is yet to be defined.

The scope of this specific project is to allow an extension developer to create integration tests (and possibly unit tests, if needed) that run against their extension alongside its dependencies (and perhaps its optional dependencies, if the developer feels those should gate the extension).

Mentioned in SAL (#wikimedia-releng) [2019-07-25T14:52:18Z] <hashar> Tagged Quibble 0.0.33 b2f9e36 | T193824 T87781 T199116

hashar removed a subscriber: hashar.Oct 7 2019, 12:33 PM