Using statistics, Bayesian inference, machine learning, and software/data engineering to solve problems and inform decisions in Product Analytics and improve product experimentation capabilities with Experiment Platform
User Details
- User Since: Jul 27 2015, 4:15 PM (550 w, 1 d)
- Availability: Available
- IRC Nick: bearloga
- LDAP User: Bearloga
- MediaWiki User: MPopov (WMF)
Yesterday
@cjming: By the way, the mechanism here doesn't have to be super complicated or "perfect". We just need a way of avoiding unnecessarily logging exposure if/when we can.
(Sounds good!)
- Notable observations
- v0.1 (initial release)
Mon, Feb 9
@MoritzMuehlenhoff: Oh that's a great point. Yes, the outcome is that @AKhatun_WMF would have admin rights on the analytics_product Airflow instance. If getting that outcome requires the membership you mentioned, please make the necessary change.
Thanks! I copied that into the acceptance criteria and updated exp_platform to experiment_platform
Fri, Feb 6
SuggestedEdits::isActivated( $this->getContext()->getUser()
seems more accurate for action: experiment_exposure event than
SuggestedEdits::isEnabledForAnyone( $this->wikiConfig )
We want the exposure event to fire only if the suggested edits module has been activated and contains either the control or the treatment experience. We don't want an exposure event on a homepage visit where the module hasn't been activated, because in that case there is no actual exposure.
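The rule described above can be sketched as a small predicate. This is only an illustration in Python, not the GrowthExperiments implementation: the function name, its parameters, and the variant labels are assumptions made for the example.

```python
from typing import Optional

# Hypothetical variant labels; the real experiment may name its groups differently.
EXPERIMENT_VARIANTS = ("control", "treatment")


def should_log_exposure(module_activated: bool, variant: Optional[str]) -> bool:
    """Return True only when the user could actually have seen an experience.

    module_activated: whether the suggested edits module is activated for this user
                      (hypothetical stand-in for a check like SuggestedEdits::isActivated).
    variant: the experience the module contains for this user, or None if the
             user is not enrolled in the experiment.
    """
    return module_activated and variant in EXPERIMENT_VARIANTS


# A homepage visit without an activated module is not an exposure.
print(should_log_exposure(False, "treatment"))  # False
print(should_log_exposure(True, "control"))     # True
```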
As the approving party for both groups (and the person requesting this access), I approve @AKhatun_WMF's membership.
Thu, Feb 5
There is still some nuance here about what exactly counts as an exposure. But based on Kirsten's general support above, I'll go with users visiting the homepage when SuggestedEdits is enabled for the wiki. Please let me know if that is not sufficiently precise!
Wed, Feb 4
@JVanderhoop-WMF's thoughts:
I think the use case here is really specific: only some folks in the treatment group created reading lists, and they want to ensure that those folks are told that it's now a beta feature, that their list lives on, etc.
This seems rare, and I am concerned about a slippery slope of providing more "individual" information rather than the aggregates that matter in our A/B testing context. (Though I'm not clear that I've articulated that feeling all too clearly here)
I think exposure logging guidance probably warrants its own guide/page which is then linked to from the conduct an experiment guide.
@MNeisler and I discussed this and arrived at the following proposal:
Thank you!
Mon, Feb 2
Some additional data to assist with investigation: https://docs.google.com/spreadsheets/d/1P3_8tGbg3Suvfa1q8TrgAXv_jk29_z7cfkOenXDuQYE/edit?gid=1285859969#gid=1285859969 (WMF internal only)
| | wiki_id | assigned | subject_count |
|---|---|---|---|
| 0 | arwiki | control | 787 |
| 1 | arwiki | treatment | 774 |
| 2 | enwiki | control | 34215 |
| 3 | enwiki | treatment | 36732 |
| 4 | frwiki | control | 4875 |
| 5 | frwiki | treatment | 5060 |
| 6 | ptwiki | control | 1497 |
| 7 | ptwiki | treatment | 1601 |
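One way to read these counts is a sample ratio mismatch check per wiki. The sketch below assumes the intended allocation was 50/50 between control and treatment, which the comment above does not state, and uses a chi-square goodness-of-fit test with one degree of freedom. The function name `srm_check` is made up for the example.

```python
import math

# Subject counts copied from the table above: (control, treatment).
counts = {
    "arwiki": (787, 774),
    "enwiki": (34215, 36732),
    "frwiki": (4875, 5060),
    "ptwiki": (1497, 1601),
}


def srm_check(control: int, treatment: int) -> tuple:
    """Chi-square test against an ASSUMED 50/50 split; returns (chi2, p_value)."""
    expected = (control + treatment) / 2
    chi2 = ((control - expected) ** 2 + (treatment - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


for wiki, (control, treatment) in counts.items():
    chi2, p_value = srm_check(control, treatment)
    print(f"{wiki}: chi2={chi2:.2f}, p={p_value:.3g}")
```

Under that assumption, the enwiki split yields a far larger test statistic than the other three wikis. If the experiment was configured with a different allocation, the expected counts would need to change accordingly.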
Fri, Jan 30
@KReid-WMF @phuedx: I think Reader Growth has a pretty good pattern/practice that we can recommend in the docs as part of T414735: Update documentation with guidance on exposure logging.
Do we have people waiting to run experiments with these cases?
All the key metrics are now being analyzed and updated until the experiment concludes.
Thu, Jan 29
Wed, Jan 28
Confirmed, testwiki is not one of the target wikis.
Tue, Jan 27
All of the original metrics are now being analyzed by the automated analytics system.
Mon, Jan 26
sudo -u analytics-product kerberos-run-command analytics-product spark3-sql -e "DELETE FROM wmf_experiments.experiments_registry_v1 WHERE machine_name = 'growthexperiments-revise-tone';"
Fri, Jan 23
Or maybe just inserting an empty div (with experiment ID and variation ID embedded into class or as data attributes) whenever we want to indicate exposure from server-side?
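A rough sketch of what such a server-rendered marker could look like, written in Python for illustration only. The class name and the `data-*` attribute names are hypothetical, not an existing convention of the experiment platform.

```python
from html import escape


def exposure_marker(experiment_id: str, variation_id: str) -> str:
    """Render an empty div that client-side code could detect to log exposure.

    The class and attribute names are placeholders chosen for this example.
    """
    return (
        '<div class="experiment-exposure" '
        f'data-experiment-id="{escape(experiment_id, quote=True)}" '
        f'data-variation-id="{escape(variation_id, quote=True)}"></div>'
    )


print(exposure_marker("growthexperiments-revise-tone", "treatment"))
```

A client-side listener could then look for elements with that class on page load and send the exposure event using the embedded IDs.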
Updated task description and AC to be about desired outcome, rather than the solution.
@KReid-WMF asked
Different design of or text copy on Special:CreateAccount
I think I see why we can't do this from js - because there's no js code change associated with the experiment. Is that right? If so, why not make it a requirement to add a js listener to page load that sends an event instead of asking the server to tell the js?
While discussing this with @phuedx and @KReid-WMF, it wasn't clear whether this is the right way to go. We agreed that motivating examples / use cases would help us arrive at the best solution to the underlying problem.
Thu, Jan 22
Tue, Jan 20
$ sudo -u analytics-product kerberos-run-command analytics-product bash
$ source conda-analytics-activate base
$ conda activate .conda/envs/2025-12-12T18.41.10_bearloga
$ mkdir /tmp/section_editing_reanalysis
$ cd /tmp/section_editing_reanalysis
$ cp /home/bearloga/T415129-editing-ab-test-reanalysis.py ./
$ ipython
For transparency, the full list of wikis that are included in re-analysis as a result of this condition is:
Updated experiments.py in a clone of the jobs repo to exclude non-Wikipedia wikis during analysis:
Awesome, great work! I just verified and merged the MR. Thank you for investigating, thinking about this in depth, and providing a fix.
Fri, Jan 16
Thank you so much for investigating that and proposing a short term solution! Once I added the partition pushdown to (1) experiment assignment queries and (2) all the fact tables, I saw huge performance gains – experiment analysis that was previously DNF at 8 minutes finished in under 2 minutes! It also helped with the "obtaining possible values of dimensions from last X days of traffic data" feature.
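For readers unfamiliar with the idea, partition pushdown here means filtering on the table's partition columns directly so the query engine can skip irrelevant partitions instead of scanning them. A minimal sketch of building such a filter, assuming Hive-style `year`, `month`, `day` partition columns (an assumption; the actual tables and the merged fix may differ):

```python
from datetime import date, timedelta


def partition_predicate(start: date, end: date) -> str:
    """Build a WHERE-clause fragment over ASSUMED year/month/day partition columns.

    Enumerating each day explicitly keeps the filter on the partition columns
    themselves, which lets the engine prune partitions outside the range.
    """
    if end < start:
        raise ValueError("end must not be before start")
    clauses = []
    day = start
    while day <= end:
        clauses.append(
            f"(year = {day.year} AND month = {day.month} AND day = {day.day})"
        )
        day += timedelta(days=1)
    return "(" + " OR ".join(clauses) + ")"


# Example: a two-day window spanning a month boundary.
print(partition_predicate(date(2026, 1, 31), date(2026, 2, 1)))
```

The resulting fragment would be combined with the rest of the query's conditions using `AND`.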