
Stop overriding TestKitchen JS config vars on account creation
Closed, Resolved · Public

Description

In T401308 an ad-hoc experiment enrollment was introduced in order to log the experiment exposure metric for newcomers in our Grafana dashboard. To do so, the TestKitchen experiment manager initialization steps are replicated in GrowthExperiments' LocalUserCreated hook, see ExperimentXLabManager.php#48. This is bad because it makes the value of wgMetricsPlatformUserExperiments nondeterministic for other extensions, depending on hook execution order, and because it makes GrowthExperiments responsible for something it should not own at all.

This hack has to be removed, but keeping the Grafana dashboards may still be desirable. In T410896: Expose Experiment Enrollment Sampling method, TestKitchen exposes the notion of a Coordinator that could be suitable for this use case.

Open questions

  • How much are these charts used, and how useful are they?

Acceptance criteria

  • GrowthExperiments does not initialize TestKitchen's manager
  • GrowthExperiments does not override TestKitchen's JS config vars
  • [To clarify] GrowthExperiments logs experiment exposure for newcomers to Grafana
  • Related test code is also removed

Event Timeline

Change #1219179 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] [WIP] ExperimentManager: avoid enrollment override

https://gerrit.wikimedia.org/r/1219179

Change #1219578 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] test(ExperimentXLabManager): ensure enrollment works as expected

https://gerrit.wikimedia.org/r/1219578

Change #1219578 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] test(ExperimentXLabManager): ensure enrollment works as expected

https://gerrit.wikimedia.org/r/1219578

Change #1219179 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] ExperimentManager: avoid re-enrollment override

https://gerrit.wikimedia.org/r/1219179

I think keeping the Account creation section in the KPIs dashboard is fine for observing account creation trends in general, and per-experiment-group account creation for logged-in experiments. These charts currently record how many users are enrolled into an experiment and get a group assignment at the time of account creation. All experiments we've done so far targeted logged-in users and used mw-user as the identifier type in the experiment config, so that made some sense. It was also a partial metric, though: many more users were part of the experiment, because enrollment happened on Homepage visits for users with < 100 edits. So, how useful were these charts for that experiment?

For the upcoming logged-out experiments these charts are going to be useless, or plain wrong. That's because the artificial re-enrollment that GE performs for logged-in users has no relation to whatever enrollment already happened using edge-unique as the identifier type in the experiment config. So it will try to re-enroll for a logged-in experiment and either find no experiment to match, or return a group that does not necessarily match.

I think some options are:

  1. Remove the per-group charts and just have a single all-groups account creation chart
  2. Keep them as they are but improve the title and help text so they are easier to read, e.g. "Newcomer experiment exposure"
  3. Refactor the section to be meaningful for both logged-in and logged-out experiments, and for both the strict and loose definitions of newcomer (0 edits vs. < 100 edits). This also requires code changes.

What do you think @KStoller-WMF @Iflorez @Michael ?

Thank you for outlining these options!

I'm not sure how useful these charts are anymore, except as a backup check for us. In the TestKitchen automated experiment analysis, we should already see how users are enrolled in the experiment variants.

Also, with our upcoming changes, we will likely have users enrolled in more than just one experiment at the same time. How should we deal with that in these charts? In principle, having 1 chart per variant should still scale fine, we just need to keep in mind that all the charts together will sum up to more than 100%.

Could we have a simpler logic for this?

  1. Loop through all the valid experiments that we have going on
  2. For each, record the experiment name and experiment group (control/treatment/unsampled) as two labels
  3. Have one chart per experiment showing the distribution of the associated groups

What do you think?
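The three steps above could be sketched roughly as follows. This is illustrative Python only, not GrowthExperiments code: the experiment names and groups are made up, and a real implementation would emit Prometheus/statsd-style metrics rather than use an in-memory counter.

```python
# Sketch of the proposed labeling scheme: count enrollments keyed by
# the (experiment name, group) label pair, then slice per experiment
# to drive one chart per experiment.
from collections import Counter

# Hypothetical enrollment events. A user enrolled in several concurrent
# experiments contributes one sample per experiment, which is why the
# charts can sum to more than 100% of account creations.
enrollments = [
    ("exp-a", "control"),
    ("exp-a", "treatment"),
    ("exp-a", "treatment"),
    ("exp-b", "unsampled"),
    ("exp-b", "control"),
]

counts = Counter(enrollments)  # keyed by (experiment, group)

def chart_data(experiment):
    """Return the group distribution for one experiment's chart."""
    return {group: n for (exp, group), n in counts.items() if exp == experiment}

print(chart_data("exp-a"))  # {'control': 1, 'treatment': 2}
```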

That makes sense to me. However, I'm not sure whether by _record experiment name and experiment group_ you mean to keep recording only new account registration assignments, i.e. newcomer-refined experiment exposure, or whether what we want is real experiment exposure, which would then have to be recorded on page impressions. The second is tricky: enrollment happens on each impression, and we lack an easy way to check whether the user was already recorded. I understand this is done in post-analysis with uniq/distinct operators over the user id in the query, but I don't know how to do that without relying on some persistent storage. If we just keep newcomer experiment exposure as a rough KPI metric, that's simple, and I can file a task for having per-experiment charts instead of the current setup.
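The deduplication problem described above can be illustrated with a small sketch (plain Python, all names hypothetical): counting real exposure on impressions only works if we remember which users were already counted, which is exactly the persistent state that is missing in the live pipeline.

```python
# Why impression-level exposure counting needs persistent state:
# without remembering already-counted users, every repeat impression
# would inflate the exposure metric.
seen = set()   # in production this would have to be durable, shared storage
exposures = {}

def record_exposure(experiment, user_id):
    """Count a user at most once per experiment."""
    key = (experiment, user_id)
    if key in seen:
        return  # this user was already counted for this experiment
    seen.add(key)
    exposures[experiment] = exposures.get(experiment, 0) + 1

# User 1 hits exp-a twice; only the first impression is counted.
for exp, uid in [("exp-a", 1), ("exp-a", 1), ("exp-a", 2), ("exp-b", 1)]:
    record_exposure(exp, uid)

print(exposures)  # {'exp-a': 2, 'exp-b': 1}
```

In post-analysis this dedup is just a DISTINCT over user ids; doing it at recording time is what requires the storage the comment above mentions.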

Sgs added a subscriber: phuedx.

Since repurposing the dashboard as @Michael suggests may have its own challenges, I've filed T416583 so we can plan when to do it. Also, @phuedx has confirmed that TestKitchen will drive the changes from T413983 in a backwards-compatible way, so our current charts are not affected by TK releases.