Revise Tone: Instrumentation
Open, High, Public, 3 Estimated Story Points

Description

User story & summary:

As the Growth team, I want instrumentation set up for the Revise Tone task so that we can measure how newcomers engage with it and evaluate its impact on new editors.

Details:

  • Implement instrumentation to track user interactions with the Revise Tone Structured Task. Events should capture entry, task interaction, and completion flows to support evaluation of feature adoption and effectiveness.
  • Partner with product Analytics to decide on instrumentation details: T397247: Measurement Plan: Revise Tone Structured Task (WE1.1, FY25-26)
Acceptance criteria:
  • Implement instrumentation necessary to support the Revise Tone Measurement Plan / Revise Tone Instrumentation Specs
    • The following things should be instrumented with xLab/Test-Kitchen:
      • clicks (and impressions, if time allows) on a Revise Tone task on the Homepage
      • clicks (and impressions, if time allows) on the save buttons (both in VE directly and in the edit-comment dialog) while in a Revise Tone task
      • clicks on the "Revise" button
      • clicks/submits of a declined Revise Tone Edit Check with the respective reason
  • We collect neither IP addresses nor Browser User Agents
NOTE: This task excludes instrumentation needed for the onboarding quiz. That will be covered in T406252: 🧑‍💻 Instrument the Revise Tone Onboarding Quiz
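The acceptance criteria above can be sketched as a small event builder. This is only an illustration, not the GrowthExperiments or Test Kitchen implementation: the function name, the field names (`action`, `action_source`, `decline_reason`) and the list of forbidden keys are all assumptions made up for the example. It shows the two constraints from the list: only clicks and impressions are recorded, and neither IP address nor User-Agent ends up in the payload.

```python
from typing import Any, Dict, Mapping, Optional

# Hypothetical key names; the real fields are defined by the instrumentation spec.
FORBIDDEN_FIELDS = frozenset({"ip", "client_ip", "user_agent"})
ALLOWED_ACTIONS = frozenset({"click", "impression"})


def build_interaction_event(
    action: str,
    action_source: str,
    context: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an interaction payload that never carries IP or User-Agent."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    event: Dict[str, Any] = {"action": action, "action_source": action_source}
    for key, value in (context or {}).items():
        # Silently drop anything that would count as personal information.
        if key.lower() in FORBIDDEN_FIELDS:
            continue
        event[key] = value
    return event


# Example: a declined Revise Tone Edit Check with its reason.
declined = build_interaction_event(
    "click",
    "edit_check_decline",  # hypothetical source name
    {"decline_reason": "other", "user_agent": "Mozilla/5.0"},
)
```

Dropping the forbidden keys at build time, instead of relying on the stream configuration alone, is just one possible design choice for keeping the collection in the low risk tier.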

Details

Related Changes in Gerrit:
Repo | Branch | Lines +/-
mediawiki/extensions/GrowthExperiments | master | +43 -69
mediawiki/extensions/GrowthExperiments | wmf/1.46.0-wmf.4 | +43 -69
mediawiki/extensions/GrowthExperiments | master | +24 -68
mediawiki/extensions/GrowthExperiments | master | +6 -40
operations/mediawiki-config | master | +1 -1
mediawiki/extensions/GrowthExperiments | master | +4 -2
operations/mediawiki-config | master | +0 -2
mediawiki/extensions/GrowthExperiments | master | +24 -2
mediawiki/extensions/GrowthExperiments | wmf/1.46.0-wmf.3 | +7 -1
mediawiki/extensions/GrowthExperiments | master | +45 -0
mediawiki/extensions/GrowthExperiments | master | +7 -1
operations/mediawiki-config | master | +34 -0
mediawiki/extensions/GrowthExperiments | master | +38 -4
mediawiki/extensions/GrowthExperiments | master | +98 -0

Event Timeline

KStoller-WMF updated the task description.
KStoller-WMF moved this task from Inbox to Blocked on the Growth-Team board.

(Moving to blocked for now, as we will need to start T397247: Measurement Plan: Revise Tone Structured Task (WE1.1, FY25-26) before engineers can start instrumentation.)

KStoller-WMF renamed this task from Revise Tone: Instrumentation to Revise Tone: Instrumentation. Nov 2 2025, 9:07 PM
KStoller-WMF raised the priority of this task from Medium to High.
KStoller-WMF updated the task description.
KStoller-WMF set the point value for this task to 3. Nov 3 2025, 5:15 PM

FWIW, the homepagevisit and homepagemodule schemas (and consequently newcomertasks, which is linked to the latter) currently collect the client IP address. This is something we opted into a long time ago. In addition, all schemas log the User-Agent. Both User-Agent and IP address are considered personal information by our privacy policy. Since we also collect the unhashed user ID, we would very likely fail the last low-risk criterion, which would make the collection Medium risk and therefore require L3SC approval.

@Michael said we need neither the UA nor the IP address, so it might make sense to remove them instead, but for now this is something we need to account for.

As I understand it, the comment assumes that we would be using these three schemas (homepagevisit, homepagemodule, newcomertasks), but @Iflorez requested that we use Test Kitchen as much as possible. So my idea was to use the web base schema instead, to ensure we remain in a low risk tier. Does this make sense?

Change #1203812 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] EventStramConfig: add stream for Growth's revise tone experiment

https://gerrit.wikimedia.org/r/1203812

@Iflorez, I have some questions about the instrumentation spec:

  • Constructive edit rate: the spec for this event says "edit (made on mobile web within a user's first 24 hours [where the edit is not reverted within 48 hrs])". I am assuming these restrictions will be enforced in post-analysis rather than by the instrumentation itself, because I see time criteria applied in the constructive edit rate metric query (for instance, INTERVAL 48h). Is this assumption correct?
  • Is the assumption above also correct for the other metrics that use the edit_save event data? In other words, does the edit_save event need to be recorded for all kinds of edits performed by users in the experiment sample (both groups) during the experiment, regardless of their account age, edit count, etc.?
  • The task rejection rate requires recording a page-visited event, whereas the task completion rate requires a click on the RT card itself, which launches the onboarding event. Is this intentional? Shouldn't the denominator for these two rates be the same: clicks on the RT card? If not, could you clarify which "page visited" we are supposed to record?

Hola @Sgs:

  • Constructive edit rate: the spec for this event says "edit (made on mobile web within a user's first 24 hours [where the edit is not reverted within 48 hrs])". I am assuming these restrictions will be enforced in post-analysis rather than by the instrumentation itself, because I see time criteria applied in the constructive edit rate metric query (for instance, INTERVAL 48h). Is this assumption correct?

Yes, the Constructive Edit rate metric will be calculated in the analysis/query. In analysis we can pull all edit_save events and determine whether an edit qualifies as a constructive edit, based on timing rules and revert status. To calculate Constructive Edit rate we need interaction data, including edit_save.

  • Is the assumption above also correct for the other metrics that use the edit_save event data? In other words, does the edit_save event need to be recorded for all kinds of edits performed by users in the experiment sample (both groups) during the experiment, regardless of their account age, edit count, etc.?

Yes; you can see the constructive_edit_rate calculation and how it uses edit_save.

  • The task rejection rate requires recording a page-visited event, whereas the task completion rate requires a click on the RT card itself, which launches the onboarding event. Is this intentional? Shouldn't the denominator for these two rates be the same: clicks on the RT card? If not, could you clarify which "page visited" we are supposed to record?

Indeed, these are defined separately and are not mirror concepts. Task Rejection rate captures a click on decline from the edit-check-1 view (the name used in the design spec), while Task Completion rate captures a wider start-to-end trajectory. cc @KStoller-WMF
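To make the "calculated in the analysis" point concrete, here is a minimal sketch of how the timing rules quoted from the spec (mobile web, within the user's first 24 hours, not reverted within 48 hours) could be applied to already-collected edit_save events. It is not the metric catalog query: the `EditSave` record and its field names are assumptions for illustration, and the instrumentation itself would only need to log the edit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

FIRST_EDIT_WINDOW = timedelta(hours=24)  # "within a user's first 24 hours"
REVERT_WINDOW = timedelta(hours=48)      # "not reverted within 48 hrs"


@dataclass(frozen=True)
class EditSave:
    """Hypothetical analysis-side view of one edit_save event."""
    user_registered_at: datetime
    saved_at: datetime
    is_mobile_web: bool
    reverted_at: Optional[datetime] = None  # None means never reverted


def is_constructive(edit: EditSave) -> bool:
    """Apply the spec's platform, account-age and revert rules after the fact."""
    if not edit.is_mobile_web:
        return False
    account_age = edit.saved_at - edit.user_registered_at
    if not timedelta(0) <= account_age <= FIRST_EDIT_WINDOW:
        return False
    if edit.reverted_at is not None:
        if edit.reverted_at - edit.saved_at <= REVERT_WINDOW:
            return False
    return True
```

Because every rule is evaluated at analysis time, the edit_save event can be logged for all edits by sampled users, as confirmed in the answers above.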

Change #1204840 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] WIP: revise tone instrumentation

https://gerrit.wikimedia.org/r/1204840

Change #1205179 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): add configured experiment instrument clients

https://gerrit.wikimedia.org/r/1205179

Change #1205179 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): add configured experiment instrument clients

https://gerrit.wikimedia.org/r/1205179

Change #1205188 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): log an edit_save event for users in experiment

https://gerrit.wikimedia.org/r/1205188

Is determining whether an edit was a revise-tone one also left to the analysis? I cannot see any revise-tone tag in the metric catalog queries, so I'm wondering whether we need this information somewhere in the instrumentation, or whether it will only be needed for manual analysis rather than automated analysis, which would explain why I cannot find it in the queries.

Change #1204840 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): add experiment instrumentation

https://gerrit.wikimedia.org/r/1204840

Short answer: no, we do not need to add revise-tone tag info to the instrumentation build out. That information will only be utilized in the manual analysis.

In more detail: To determine if an edit was a revise-tone edit we'll perform a manual analysis with data pulled from the Test Kitchen table on the event database (using revision_id, wiki, namespace id, and revision date range information) and the event.mediawiki_revision_tags_change table (using database, rev_timestamp, rev_id, tags). See cells F14 to F16 in the instrumentation specification.
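The join described above could look roughly like the following. This is a sketch over in-memory rows, not the actual analysis: the tag name `revise-tone` is a placeholder, and matching the Test Kitchen `wiki` field against the `database` field of the tags table is an assumption. Only the column names mentioned in the comment (revision_id, wiki, database, rev_id, tags) are taken from the task.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

REVISE_TONE_TAG = "revise-tone"  # placeholder; the real change tag may differ

Row = Dict[str, Any]


def tagged_revisions(tag_changes: Iterable[Row]) -> Set[Tuple[str, int]]:
    """Collect (database, rev_id) pairs whose tags include the placeholder tag."""
    return {
        (row["database"], row["rev_id"])
        for row in tag_changes
        if REVISE_TONE_TAG in row["tags"]
    }


def split_edits(
    edit_events: Iterable[Row], tag_changes: Iterable[Row]
) -> Tuple[List[Row], List[Row]]:
    """Split edit events into (revise-tone edits, all other edits)."""
    tagged = tagged_revisions(tag_changes)
    revise_tone: List[Row] = []
    other: List[Row] = []
    for event in edit_events:
        # Assumes the event's `wiki` value equals the tags table's `database`.
        key = (event["wiki"], event["revision_id"])
        (revise_tone if key in tagged else other).append(event)
    return revise_tone, other
```

Keeping the tag lookup on the analysis side matches the short answer above: the instrumentation does not need to carry any revise-tone tag information.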

Adding this quick update for future reference:

See the latest on the new Test Kitchen stream and stream naming discussion related to https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1203812

Change #1203812 merged by jenkins-bot:

[operations/mediawiki-config@master] EventStreamConfig: add stream for Growth and Editing team edit rates

https://gerrit.wikimedia.org/r/1203812

Mentioned in SAL (#wikimedia-operations) [2025-11-18T14:30:52Z] <sgimeno@deploy2002> Started scap sync-world: Backport for [[gerrit:1203812|EventStreamConfig: add stream for Growth and Editing team edit rates (T405177)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-18T14:36:00Z] <sgimeno@deploy2002> sgimeno: Backport for [[gerrit:1203812|EventStreamConfig: add stream for Growth and Editing team edit rates (T405177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-18T14:40:57Z] <sgimeno@deploy2002> Finished scap sync-world: Backport for [[gerrit:1203812|EventStreamConfig: add stream for Growth and Editing team edit rates (T405177)]] (duration: 10m 05s)

Change #1206891 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users

https://gerrit.wikimedia.org/r/1206891

Change #1206891 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users

https://gerrit.wikimedia.org/r/1206891

Change #1206936 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.3] fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users

https://gerrit.wikimedia.org/r/1206936

Change #1205188 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): log an edit_save event for users in experiment

https://gerrit.wikimedia.org/r/1205188

Moving this to QA while I self-QA it and get @Iflorez's feedback.

Change #1206936 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.3] fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users

https://gerrit.wikimedia.org/r/1206936

Mentioned in SAL (#wikimedia-operations) [2025-11-19T14:30:50Z] <sgimeno@deploy2002> Started scap sync-world: Backport for [[gerrit:1206936|fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177)]], [[gerrit:1207118|fix(MigrateMentorStatusAway): ensure migration respects date format (T409170)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-19T14:35:28Z] <sgimeno@deploy2002> sgimeno: Backport for [[gerrit:1206936|fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177)]], [[gerrit:1207118|fix(MigrateMentorStatusAway): ensure migration respects date format (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-19T14:40:00Z] <sgimeno@deploy2002> Finished scap sync-world: Backport for [[gerrit:1206936|fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177)]], [[gerrit:1207118|fix(MigrateMentorStatusAway): ensure migration respects date format (T409170)]] (duration: 09m 09s)

Change #1207825 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] ReviseToneExperimentEditIngress: add debug statements

https://gerrit.wikimedia.org/r/1207825

Change #1207825 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] ReviseToneExperimentEditIngress: add debug statements

https://gerrit.wikimedia.org/r/1207825

Change #1207882 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] EventStreamConfig: drop revision and namespace id from contributors.experiments

https://gerrit.wikimedia.org/r/1207882

Change #1207883 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] instrumentation(ReviseTone): manually collect revision and namespace

https://gerrit.wikimedia.org/r/1207883

Change #1207882 merged by jenkins-bot:

[operations/mediawiki-config@master] EventStreamConfig: drop revision and namespace id from contributors.experiments

https://gerrit.wikimedia.org/r/1207882

Change #1208303 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] instrumentation(ReviseTone): add SE enabled to context on homepage visits

https://gerrit.wikimedia.org/r/1208303

Change #1207883 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] instrumentation(ReviseTone): manually collect revision and namespace

https://gerrit.wikimedia.org/r/1207883

Change #1210526 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] [beta] GrowthExperiments: increase to log level to debug

https://gerrit.wikimedia.org/r/1210526

Change #1210526 merged by jenkins-bot:

[operations/mediawiki-config@master] [beta] GrowthExperiments: increase log level to debug

https://gerrit.wikimedia.org/r/1210526

Change #1210610 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] anlytics(ReviseTone): use ReviseToneExperimentInteractionLogger instead of experiment

https://gerrit.wikimedia.org/r/1210610

Change #1210619 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] analytics(ReviseTone): track suggestion-shown interaction

https://gerrit.wikimedia.org/r/1210619

Change #1210610 abandoned by Sergio Gimeno:

[mediawiki/extensions/GrowthExperiments@master] anlytics(ReviseTone): use ReviseToneExperimentInteractionLogger instead of experiment

Reason:

Squashed in I6dbb137f32ca88fd1baeecd89e546a41235ea8d8

https://gerrit.wikimedia.org/r/1210610

Change #1208303 abandoned by Sergio Gimeno:

[mediawiki/extensions/GrowthExperiments@master] instrumentation(ReviseTone): fix stream for edits and refine exposure

Reason:

Messed change-ids, squashed in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/1210619

https://gerrit.wikimedia.org/r/1208303

Change #1212128 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.4] instrumentation(ReviseTone): fix stream for edits and refine exposure

https://gerrit.wikimedia.org/r/1212128

Change #1212128 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.4] instrumentation(ReviseTone): fix stream for edits and refine exposure

https://gerrit.wikimedia.org/r/1212128

Mentioned in SAL (#wikimedia-operations) [2025-11-27T14:17:58Z] <sgimeno@deploy2002> Started scap sync-world: Backport for [[gerrit:1212106|fix(ReviseTone): only initialize once]], [[gerrit:1212108|fix(ReviseTone): render behind EditNotice on mobile]], [[gerrit:1212128|instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-27T14:19:53Z] <sgimeno@deploy2002> sgimeno, migr: Backport for [[gerrit:1212106|fix(ReviseTone): only initialize once]], [[gerrit:1212108|fix(ReviseTone): render behind EditNotice on mobile]], [[gerrit:1212128|instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Change #1210619 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] instrumentation(ReviseTone): fix stream for edits and refine exposure

https://gerrit.wikimedia.org/r/1210619

Mentioned in SAL (#wikimedia-operations) [2025-11-27T14:28:43Z] <sgimeno@deploy2002> Finished scap sync-world: Backport for [[gerrit:1212106|fix(ReviseTone): only initialize once]], [[gerrit:1212108|fix(ReviseTone): render behind EditNotice on mobile]], [[gerrit:1212128|instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252)]] (duration: 10m 46s)