MinT for Wiki Readers: pagevisit instrumentation for experiment
Closed, Resolved | Public | 4 Estimated Story Points

Description

Comparing user retention between the control and treatment groups is one of the primary metrics for the MinT for Wiki Readers experiment (WE 3.1.5 for FY25/26). We need additional instrumentation to track and monitor this as part of xLab Automated Analytics. The following action needs to be instrumented:

When a subject from either the control or the treatment group visits any page

action: page_visited

The events can be sent to the stream mediawiki.product_metrics.translation_mint_for_readers.experiments, using the latest schema version, /analytics/product_metrics/web/translation/1.4.2.
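
As a rough illustration of the shape such an event might take, the sketch below builds a plain object carrying the stream, schema and action named above. The `$schema` and `meta.stream` fields follow the general Event Platform convention; `buildPageVisitEvent` is a hypothetical helper, and in practice the Metrics Platform client assembles the event and adds further fields (experiment enrollment, timestamps and so on) that are not shown here.

```javascript
// Stream and schema as named in the task description.
const STREAM_NAME = 'mediawiki.product_metrics.translation_mint_for_readers.experiments';
const SCHEMA_ID = '/analytics/product_metrics/web/translation/1.4.2';

// Hypothetical helper: includes only the fields discussed in this task.
function buildPageVisitEvent() {
	return {
		$schema: SCHEMA_ID,
		meta: { stream: STREAM_NAME },
		action: 'page_visited'
	};
}

console.log( JSON.stringify( buildPageVisitEvent(), null, 2 ) );
```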

The code will be part of the WikimediaEvents extension. For reference, see the PageVisit instrument used for the Synthetic A/A test.


Ignore the following (please see the discussion); keeping it for archival

When a subject from either the control or the treatment group visits any page

action: page_visited
action_subtype: initial_visit

When the subject visits and scrolls (the first time)

action: page_scrolled
action_subtype: initial_scroll

When the subject visits and stays for at least 3 seconds (in visible state)

action: page_visited
action_subtype: visited_3secs

The primary page visit event is the important one, and the other two can be considered if time and bandwidth permit.

Event Timeline

@mpopov I would like your thoughts/review on the instrumentation proposed above (from our conversation last week).

Also, in the PageVisit instrument used for Synthetic A/A test, the experiment name is hard coded into the code (const EXPERIMENT_NAME = 'sds2-4-11-synth-aa-test';)
Do we have to do the same for MinT for Wiki Readers, or will it automatically be populated by xLab if we launch the experiment through it?

The primary page visit event is the important one, and the other two can be considered if time and bandwidth permit.

It doesn't sound like those other two are important, so I recommend just cutting them.

action: page_visited
action_subtype: initial_visit

Do you need action_subtype? Would the instrument keep track of visits? Would it know if the current visit is not an initial visit and set action_subtype to return_visit? Where would it persist this knowledge? Would it persist across multiple sessions from the same subject?

You should probably determine initial vs returning at analysis time, rather than data collection time.

Also, in the PageVisit instrument used for Synthetic A/A test, the experiment name is hard coded into the code (const EXPERIMENT_NAME = 'sds2-4-11-synth-aa-test';)
Do we have to do the same for MinT for Wiki Readers, or will it automatically be populated by xLab if we launch the experiment through it?

No, the experiment name is the identifier that links deployed experiment code to experiment configuration in xLab to experiment configuration in automated analytics.

So yes, it has to be hardcoded into the specific implementation of the experiment. When we ran the second synth A/A test (as a separate experiment, T397138), we had to change the name (https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/1160475/1/modules/ext.wikimediaEvents/xLab/pageVisit.js) and deploy the code change.

You don't have to separate it out like we did, though, and the code can just be const mintExperiment = mw.xLab.getExperiment( 'fy25-26-we-3-1-6-mint-readers' ); if fy25-26-we-3-1-6-mint-readers is the machine-readable name by which you want to identify the experiment.
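
A minimal sketch of what that could look like. It assumes the object returned by mw.xLab.getExperiment() exposes a send( action ) method; that method name and its arguments are an assumption not confirmed in this thread, so check them against the xLab documentation. `fakeXLab` is a hypothetical stand-in that lets the snippet run outside MediaWiki, and EXPERIMENT_NAME uses the name proposed in this comment; substitute whichever name is finally agreed.

```javascript
const EXPERIMENT_NAME = 'fy25-26-we-3-1-6-mint-readers';

// xLab is passed in so the function can be exercised without MediaWiki;
// in production the argument would be mw.xLab.
function logPageVisit( xLab ) {
	const mintExperiment = xLab.getExperiment( EXPERIMENT_NAME );
	// ASSUMPTION: the experiment object has a send( action ) method.
	mintExperiment.send( 'page_visited' );
}

// Hypothetical stand-in for mw.xLab that records what would be sent.
const sent = [];
const fakeXLab = {
	getExperiment: ( name ) => ( {
		send: ( action ) => sent.push( { experiment: name, action } )
	} )
};

logPageVisit( fakeXLab );
console.log( sent );
```

Because the name is the only link between the deployed code and the experiment configuration, keeping it in a single constant makes a later rename a one-line change.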

I created a separate experiment-focused instrumentation spec template which includes these details, if you want to use that. Alternatively, feel free to borrow the table from the description of T397143: Run a logged-in synthetic A/A test using the PHP SDK

It doesn't sound like those other two are important, so I recommend just cutting them.

Yes, to start with they are not important. For feature retention within MinT for Readers, after a user opens the automatic translation, we're planning to require that at least one section was expanded or that at least three seconds were spent on the page. That lets us reasonably say the user intended to read the article, and distinguish them from users who opened it accidentally or were just curious to try out a new feature without actually using it in the intended way. I was wondering whether we could use similar measures for retention here as well, but it can be kept to a minimum initially.
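
To make that criterion concrete, here is a small sketch of the check as a predicate. The field names sectionsExpanded and visibleMs are hypothetical, as is the function name; only the "at least one section expanded or at least three seconds" rule comes from the comment above.

```javascript
const MIN_VISIBLE_MS = 3000; // three seconds, per the comment above

// Hypothetical predicate: true when a visit looks like an intentional read.
// ASSUMPTION: sectionsExpanded and visibleMs are illustrative field names.
function isIntentionalRead( { sectionsExpanded, visibleMs } ) {
	return sectionsExpanded >= 1 || visibleMs >= MIN_VISIBLE_MS;
}

console.log( isIntentionalRead( { sectionsExpanded: 0, visibleMs: 1200 } ) ); // → false
console.log( isIntentionalRead( { sectionsExpanded: 1, visibleMs: 1200 } ) ); // → true
console.log( isIntentionalRead( { sectionsExpanded: 0, visibleMs: 3000 } ) ); // → true
```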

Do you need action_subtype? Would the instrument keep track of visits? Would it know if the current visit is not an initial visit and set action_subtype to return_visit? Where would it persist this knowledge? Would it persist across multiple sessions from the same subject?

If we only keep the simple page visit event and drop the rest, a sub-type is not really required.

You should probably determine initial vs returning at analysis time, rather than data collection time.

Agreed.
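
To illustrate what deciding this at analysis time could look like: given a list of page_visited events with a subject identifier and a timestamp, each subject's earliest event is the initial visit and every later one is a return visit. The field names subjectId and ts are hypothetical and are not taken from the schema.

```javascript
// Label each event as 'initial_visit' or 'return_visit' per subject.
// ASSUMPTION: events carry hypothetical fields subjectId and ts (ISO 8601 string).
function labelVisits( events ) {
	const seen = new Set();
	return [ ...events ]
		.sort( ( a, b ) => Date.parse( a.ts ) - Date.parse( b.ts ) )
		.map( ( event ) => {
			const visitType = seen.has( event.subjectId ) ? 'return_visit' : 'initial_visit';
			seen.add( event.subjectId );
			return { ...event, visitType };
		} );
}

const labelled = labelVisits( [
	{ subjectId: 'b', ts: '2025-08-21T09:00:00Z' },
	{ subjectId: 'a', ts: '2025-08-20T10:00:00Z' },
	{ subjectId: 'a', ts: '2025-08-22T08:30:00Z' }
] );
console.log( labelled.map( ( e ) => `${ e.subjectId }: ${ e.visitType }` ) );
// → [ 'a: initial_visit', 'b: initial_visit', 'a: return_visit' ]
```

Doing this downstream means the instrument needs no client-side state, which sidesteps the questions about where visit history would be persisted across sessions.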

I created a separate experiment-focused instrumentation spec template which includes these details, if you want to use that. Alternatively, feel free to borrow the table from the description of T397143: Run a logged-in synthetic A/A test using the PHP SDK

Thank you! Creating an experiment-specific instrument spec is a good idea.

Nikerabbit triaged this task as Medium priority. (Jul 8 2025, 6:19 AM)
Nikerabbit moved this task from Backlog to Product integration on the MinT board.
Nikerabbit set the point value for this task to 4. (Jul 9 2025, 7:20 AM)

Change #1171190 had a related patch set uploaded (by Huei Tan; author: Huei Tan):

[mediawiki/extensions/ContentTranslation@master] MinT: pagevisit instrumentation for experiment

https://gerrit.wikimedia.org/r/1171190

Change #1171190 abandoned by Huei Tan:

[mediawiki/extensions/ContentTranslation@master] MinT: pagevisit instrumentation for experiment

Reason:

this should be done in Wikimedia Events extension, not here.

https://gerrit.wikimedia.org/r/1171190

Change #1172332 had a related patch set uploaded (by Huei Tan; author: Huei Tan):

[mediawiki/extensions/WikimediaEvents@master] MinT: add an experiment-specific instrument for mint readers

https://gerrit.wikimedia.org/r/1172332

Change #1172332 abandoned by Huei Tan:

[mediawiki/extensions/WikimediaEvents@master] MinT: add an experiment-specific instrument for mint readers

Reason:

drop

https://gerrit.wikimedia.org/r/1172332

Change #1172332 restored by Huei Tan:

[mediawiki/extensions/WikimediaEvents@master] MinT: add an experiment-specific instrument for mint readers

https://gerrit.wikimedia.org/r/1172332

@KCVelaga_WMF the description mentions WE 3.1.1. Can you confirm whether this is WE 3.1.6 or WE 3.1.1?

If we are going to use fy25-26-we-3-1-6-mint-readers as the experiment name, this is WE 3.1.6.

Or, is this 3.1.5?

WE3.1.5
If we provide web readers the option to view a machine translated version of Wikipedia content unavailable in their language, we'll learn if reading activity is increased, measured as a 3% increase in page interactions, drawing readers to the local language wiki with a potential increase in local editing activity. This will be provided as a controlled A/B testing setting for no longer than 6 months, and in 13 Wikipedias with prior consent, using open machine translation services already available to Wikipedia editors.

@hueitan thanks for pointing that out. Please use fy25-26-we-3-1-5-mint-readers

The KR number got changed a few times, but 3.1.5 is the latest.

I need some clarification:

For the current Synthetic A/A test, are we ONLY interested in tracking the page visit event, and not the entry points to visit MinT wiki readers (which would be handled in a later A/B test)?

If I understand correctly, I should move the page visit event instrument code (code) into the Content Translation's MinT landing page to ensure accurate tracking for the A/A test, because there is no assigned group in the config yet?

Once the A/A test is validated on production, is the plan to then implement feature toggling and clickthrough rate instrumentation for the A/B test through the entry points (article footer/language selector)?

@KCVelaga_WMF @mpopov

For the current Synthetic A/A test, are we ONLY interested in tracking the page visit event, and not the entry points to visit MinT wiki readers (which would be handled in a later A/B test)

That's right.

If I understand correctly, I should move the page visit event instrument code (code) into the Content Translation's MinT landing page to ensure accurate tracking for the A/A test, because there is no assigned group in the config yet?

I can't comment on where the code should live, but from what I have heard it should be WikimediaEvents (let me check with the Experiment Platform team). The A/A test will have both treatment and control groups; the treatment group just won't receive any actual treatment, i.e. they will see the same thing as the control group. We'll still track both groups.

Once the A/A test is validated on production, is the plan to then implement feature toggling and clickthrough rate instrumentation for the A/B test through the entry points (article footer/language selector)?

Yes, that is correct. The A/A test will also help us estimate how long we should run the test in production to get enough sample data.
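
As a back-of-the-envelope illustration of that estimate: once the A/A test shows how many subjects are enrolled per day, the run length is roughly the total required sample divided by that rate. All numbers below are hypothetical, and the required sample size per group would come from a separate power analysis that is not shown.

```javascript
// Days needed to reach a target sample size, given the enrollment rate
// observed during the A/A test. All inputs here are hypothetical.
function estimateRunDays( requiredSubjectsPerGroup, groupCount, subjectsEnrolledPerDay ) {
	if ( subjectsEnrolledPerDay <= 0 ) {
		throw new RangeError( 'subjectsEnrolledPerDay must be positive' );
	}
	return Math.ceil( ( requiredSubjectsPerGroup * groupCount ) / subjectsEnrolledPerDay );
}

console.log( estimateRunDays( 10000, 2, 1500 ) ); // → 14
```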

If I understand correctly, I should move the page visit event instrument code (code) into the Content Translation's MinT landing page to ensure accurate tracking for the A/A test, because there is no assigned group in the config yet?

Hi!

You can write your experiment code in the WikimediaEvents extension or in your own codebase. You can find this and other guidance at https://wikitech.wikimedia.org/wiki/Experimentation_Lab/Conduct_an_experiment#Code. That documentation is worth a look: it includes guidance, sample code and other useful resources for your experiment.

And of course, feel free to reach out to the Experiment Platform team. We will be really happy to help you!

Posting the A/A test and A/B test experiment flow here as well, and when a pagevisit event should be logged.

Nikerabbit changed the task status from Open to In Progress. (Aug 12 2025, 7:21 AM)
Nikerabbit raised the priority of this task from Medium to High.

Change #1172332 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] xLab: Add instrumentation for MinT readers

https://gerrit.wikimedia.org/r/1172332

Change #1179120 had a related patch set uploaded (by Huei Tan; author: Huei Tan):

[operations/mediawiki-config@master] Add Metrics Platform stream configuration and registration for MinT for Wikipedia Readers Page visit instrumentation for experiment by Language and Product Localization team.

https://gerrit.wikimedia.org/r/1179120

Change #1179120 merged by jenkins-bot:

[operations/mediawiki-config@master] MinT: Add stream configuration and registration

https://gerrit.wikimedia.org/r/1179120

Mentioned in SAL (#wikimedia-operations) [2025-08-20T07:06:31Z] <kartik@deploy1003> Started scap sync-world: Backport for [[gerrit:1179120|MinT: Add stream configuration and registration (T397600 T397043)]]

Mentioned in SAL (#wikimedia-operations) [2025-08-20T07:08:39Z] <kartik@deploy1003> kartik, hueitan: Backport for [[gerrit:1179120|MinT: Add stream configuration and registration (T397600 T397043)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-08-20T07:17:52Z] <kartik@deploy1003> Finished scap sync-world: Backport for [[gerrit:1179120|MinT: Add stream configuration and registration (T397600 T397043)]] (duration: 11m 21s)