Add MetricsPlatform to gated extensions
Closed, Declined (Public)

Description

MetricsPlatform aims to become the standard tool for experimentation and analytics around WMF products. It is broadly enabled on WMF wikis, and several extensions may make use of it. For GrowthExperiments and other extensions that have used their own experimentation tooling, it would be reassuring to test the integration with MP/xLab while they migrate from one tool to the next. For GrowthExperiments this is particularly helpful, since GE is also a gated extension. For these reasons, we believe the inclusion is justified.

  • The tests for the extension run quickly (running the tests under extensions/MetricsPlatform/tests/phpunit/ takes about 6s)
  • GrowthExperiments is a gated extension and plans to make extensive use of MP
  • The extension is an important part of the analytics infrastructure on WMF products.

Event Timeline

Change #1185945 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[integration/config@master] zuul: [mediawiki/extensions/MetricsPlatform] Add to gate

https://gerrit.wikimedia.org/r/1185945

Sgs triaged this task as Medium priority. Sep 9 2025, 10:34 AM
Sgs updated the task description.
Sgs added a subscriber: phuedx.

@Jdforrester-WMF wrote on Gerrit
This needs a significant discussion first, and then sign-off from Release Engineering. Where is the Phab task?

While I understand the general sentiment about not adding stuff to gate unless we really need to, I'm not sure I see any reasons to not proceed with this task in particular.

As far as I understand things, MetricsPlatform is supposed to be _the_ way teams are going to implement A/B testing. I think noticing it breaking (because of changes to other extensions) is fairly important. Not doing this also means GrowthExperiments (which is itself gated) has to skip tests for anything that needs MetricsPlatform, which adds to the risk of not noticing breakages.

I think those are fairly strong arguments for us to go ahead with the gating, but I am curious to hear any counterarguments @Jdforrester-WMF (or anyone else) might have.

The decision is not mine to make, but Release Engineering's.

See T403560#11150311 for the same set of concerns for a different repo, and to check that you're actually asking for and accepting what this task proposes.

Bluntly: Are you absolutely sure that you want to sign up the team supporting Metrics Platform for being pinged as a UBN! forever for every repo in the MediaWiki-verse (e.g. SemanticMediaWiki) when their code breaks and they think it's your fault? Do you want to be reviewing patches from potentially hundreds of such repos? This has been a cause of a fair bit of unexpected work for teams in the past.

From chat: @thcipriani can you give additional clarity or guidance on next steps here from a releng POV? From there, this can inform @phuedx / @Milimetric, who would make the call as to whether to make this change, and we can move forward once we know whether the trade-offs are acceptable.

There is no formal process for adding extensions as gated extensions, as gated extensions as a concept pre-date that level of organizational maturity.

That said, my perspective is that gated extensions should be limited as they have a cost. Quoting myself:

Adding an extension as a gated extension will slow down everyone’s development processes in a few ways—one obvious, one subtle, one insidious:

  • Obvious slowdown – the test will take time. Anytime anyone pushes a change for an extension, your tests will run, they will take time.
  • Subtle slowdown – the amount of time lost to breakage. Anytime anyone pushes a change for an extension, it may cause your tests to fail. And when those tests fail, they’ll require your expertise to unblock it. For production, this means 500 times a week, every gated extension is directly in the critical path to production, which means some human is standing in the path to production. This leads to frustration, which causes...
  • Insidious slowdown – the amount of time lost when folks, frustrated by the other kinds of slowdowns, rail about removing an extension from the gated extensions.

Folks have become wary of adding extensions because of the slowdowns above and other, subtle issues.

One not-so-obvious example is CentralAuth, where the CentralAuth tests conflict with and shadow core tests. Lots of extensions used somewhere in production conflict with other parts of MediaWiki we care about (or other extensions in production).

@Jdforrester-WMF is doing his due diligence to advise against them.

One thing to consider, based on the task, is whether adding a bi-directional dependency on GrowthExperiments makes sense as a first step?

Hey folks!

While you’re digging into the specifics here, @thcipriani looped me into this conversation. As I continue to get up to speed at the foundation, I see this as an area where I might help bring some structure and organization to the process. It sounds like we’re largely aligned on that goal already. Would anyone have concerns if I created a task in Quality-and-Test-Engineering-Team (Quality Engineering) so we can continue exploring the process at a more general level?

Would anyone have concerns if I created a task in Quality-and-Test-Engineering-Team (Quality Engineering) so we can continue exploring the process at a more general level?

I was already working on writing one when you posted your comment, so apologies for jumping the gun here: T404403: Better prevent changes in MW core from breaking WMF production extensions. Feel free to revise that to better capture the overall issue.

Bluntly: Are you absolutely sure that you want to sign up the team supporting Metrics Platform for being pinged as a UBN! forever for every repo in the MediaWiki-verse (e.g. SemanticMediaWiki) when their code breaks and they think it's your fault? Do you want to be reviewing patches from potentially hundreds of such repos? This has been a cause of a fair bit of unexpected work for teams in the past.

My understanding is that this is not entirely correct: as far as I can tell, gated extensions' tests are only run against each other, not against non-gated extensions (so not against SMW for example, or even against e.g. CentralAuth).

@thcipriani and I met to discuss this and we ended up agreeing that perhaps mutual dependence in CI is the correct way to go for now.

It is true that we are aiming to have MetricsPlatform (which I'll call xLab) become the de facto way to run experiments at the WMF. However, my understanding is that, at least for the foreseeable future, experiment code will reside in one of a small set of experiment-focussed extensions. In order for those extensions to benefit from this they would also have to be made to be gated extensions. The impression that I got from Tyler is that the gated extension mechanism is clunky and so this would be an undesirable outcome.

The advantage of making xLab a gated extension is that its test suite would run with the test suites of all other gated extensions and so we should be able to detect breakages before they make it into production. In principle, I agree. Some time last week, I mentioned to @Sgs that extensions are not in scope for the Stable Interface Policy by default. Would the gated extension mechanism help us detect backwards-incompatible changes in the xLab SDKs? Yes. Then again, so would:

  1. Static analysis
  2. The Experiment Platform team (who I'll call the team) explicitly opting into the Stable Interface Policy for the xLab SDKs

But (1) is a little nuanced. We have Phan for PHP. So (1) will catch the team failing to enforce (2) during their own code review processes for the xLab PHP SDK. But what about the xLab JS SDK? I've thought about this a bunch and I believe an equally good way to catch backwards-incompatible changes to the xLab JS SDK would be to discourage stubbing or mocking it by providing an internally-stubbed version in the test environment. The test engineer would declare experiments in the test setup code and the xLab JS SDK would return them but, importantly, you would be interacting with the xLab JS SDK.
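To make the internally-stubbed idea a bit more concrete, here is a rough TypeScript sketch. Every name in it (`StubExperimentSdk`, `declareExperiment`, `getExperiment`, `isAssignedGroup`, `getAssignedGroup`, the `homepage-banner` experiment, and `bannerVariant`) is hypothetical and chosen for illustration only; it is not the actual xLab JS SDK interface. It only shows the shape of the approach: the test declares enrollment up front, and the code under test calls the SDK surface rather than a hand-rolled mock.

```typescript
// Hypothetical sketch only: these names are NOT the real xLab JS SDK API.
interface Experiment {
  getAssignedGroup(): string | null;
  isAssignedGroup(...groups: string[]): boolean;
}

class StubExperimentSdk {
  // Experiment name -> group the current subject is enrolled in.
  private assignments: Record<string, string> = {};

  // Called from test setup code to declare an enrollment.
  declareExperiment(name: string, group: string): void {
    this.assignments[name] = group;
  }

  // Same entry point the code under test would use outside of tests.
  getExperiment(name: string): Experiment {
    const enrolled = Object.prototype.hasOwnProperty.call(this.assignments, name);
    const group: string | null = enrolled ? this.assignments[name] : null;
    return {
      getAssignedGroup: () => group,
      isAssignedGroup: (...groups: string[]) =>
        group !== null && groups.indexOf(group) !== -1,
    };
  }

  // Clears all declared enrollments between tests.
  reset(): void {
    this.assignments = {};
  }
}

// Hypothetical feature code under test: it only talks to the SDK surface.
function bannerVariant(sdk: StubExperimentSdk): string {
  const experiment = sdk.getExperiment('homepage-banner');
  return experiment.isAssignedGroup('treatment') ? 'new-banner' : 'old-banner';
}
```

Under this assumption, a backwards-incompatible change to the SDK surface (for example, renaming `isAssignedGroup`) would fail the consumer's tests, because the stub ships with the SDK and changes with it instead of living in each consumer's test code.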

The other class of backwards-incompatible change that I've considered is changes to subject enrollment. I immediately discarded this as, frankly, experiment implementers should not be testing the xLab enrollment code. The enrollment code is covered and continuously tested by experiments running in production.

If I've misunderstood the value that making xLab a gated extension would provide, then please let me know. For now though, I'll write up the following tasks for the team:

  1. Make the MetricsPlatform extension depend on the GrowthExperiments extension in CI
  2. Explicitly opt into the Stable Interface Policy for the xLab SDKs
  3. Provide a stub xLab SDK for use in JS unit and integration tests

As these are related to OKR work, I will prioritise these as High and so the team should be able to work on them immediately/very soon. At the very least, we should do (1) immediately and then revisit it once we've done (3).

JVanderhoop-WMF subscribed.

Closing this task, as we are not making metrics platform a gated extension.