User Details
- User Since
- Feb 8 2019, 4:51 AM (356 w, 2 d)
- Availability
- Available
- IRC Nick
- ppelberg
- LDAP User
- Unknown
- MediaWiki User
- PPelberg (WMF)
Fri, Dec 5
Wonderful. All that you described sounds great. Thank you, Nick.
Per offline discussion, with 1210320 being merged, all that's left to be done here is:
- 1. Finalize experiment details. See offline discussion.
- 2. Update https://mpic.wikimedia.org/create-experiment to reflect decisions made in "1."
Per offline discussions, we're going to move forward with the 130 px / 15% scroll approach.
Jotting down some notes from offline discussion with @Pablo and @Sucheta-Salgaonkar-WMF about the Language-agnostic Reference Risk model:
- The model is already available on LiftWing: https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reference_risk_prediction#Anonymous_access.
- At present, the model accepts a revisionID and lang (e.g. https://en.wikipedia.org/w/index.php?title=Lux_(Rosal%C3%ADa_album)&oldid=1322859357) and outputs the following:
- ps_label_local: the status of that web domain in the perennial sources list of that wiki (if it exists),
- ps_label_enwiki: the status of that web domain in the English Wikipedia perennial sources list (if it exists),
- survival_ratio: using data from the model_version (currently 2024-11), the survival ratio of that web domain when used as a reference on that wiki (i.e., the number of edits for which the domain stayed on the page divided by the total number of edits since it was added). Values range from 0 to 1 (the closer to 0, the riskier),
- page_count: using data from the model_version (currently 2024-11), the number of pages on which that web domain has been used as a reference on that wiki,
- editors_count: using data from the model_version (currently 2024-11), how many editors have used that web domain as a reference on that wiki.
- The model does not use ML but applies pre-computed scores of references in each wiki. Scores for the 2024-06 version can be found at https://analytics.wikimedia.org/published/wmf-ml-models/reference-quality/reference-risk
- The model is language-agnostic
- The model has not been tested with volunteers yet. The heuristic of using the survival of references comes from ML experiments we ran: https://arxiv.org/abs/2410.18803
- The model uses pre-computed scores for each individual source (see the data folder; e.g., the scores for frwiki). Therefore, Edit Check or any other service interested in scores for a single source could already use them directly as well.
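To make the notes above concrete, here is a minimal sketch of how a consumer such as Edit Check might interpret the per-domain fields. The field names come from the list above; the flat dictionary shape, the `is_risky` helper, and both thresholds are assumptions for illustration only, not part of the model or its API.

```python
from typing import Optional, TypedDict


class DomainScores(TypedDict, total=False):
    # Field names follow the notes above; the flat shape is an assumption.
    ps_label_local: Optional[str]
    ps_label_enwiki: Optional[str]
    survival_ratio: Optional[float]
    page_count: int
    editors_count: int


def is_risky(scores: DomainScores,
             threshold: float = 0.5,
             min_pages: int = 10) -> bool:
    """Flag a domain whose survival ratio is low (closer to 0 is riskier).

    `threshold` and `min_pages` are hypothetical cut-offs, not values
    decided in the discussion.
    """
    ratio = scores.get("survival_ratio")
    if ratio is None:
        # No pre-computed score for this domain, so there is nothing to flag.
        return False
    if scores.get("page_count", 0) < min_pages:
        # Too little usage on this wiki to trust the ratio.
        return False
    return ratio < threshold
```

For example, `is_risky({"survival_ratio": 0.2, "page_count": 50})` flags the domain, while a domain with no pre-computed score is left alone.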
Thu, Dec 4
Wed, Dec 3
Having solidified the Experiment requirements in offline discussion with @MNeisler, this ticket is now ready for Editing Engineering to implement and Experimentation Platform to QA once patch(es) are ready.
Meta: "opening" this task as closing it depends on us making (and documenting) the following decisions...
Wed, Nov 26
Next step
@ppelberg: update task description to include contents of measurement plan
Thank you, Nick. Some edits reflected in the draft below...
Understood. Thank you for confirming, Julie. I've updated the task description to reflect the wikis we'd like to be included in the experiment.
Tue, Nov 25
Great spot. Yes, we do want logged-out users to be included in the experiment, if possible.
- There are a couple of approaches to how we limit the audience to newcomers (users with ≤100 cumulative edits):
- As mentioned in this thread, we could make everyone eligible for the experiment and then focus the analysis on logged-out users and registered users who have 100 or fewer edits at first exposure. This would show the new full page edit button to all users entering section editing mode, but we would limit our analytics to the target group (users with ≤100 cumulative edits).
- Limit the experiment and treatment exposure to newcomers (users with ≤100 cumulative edits) at first exposure, so experienced editors entering the experiment never see the new full page edit button.
If the second approach [i] does not add complexity, I'd prefer we limit exposure of the feature to the people the intervention is expressly meant to impact.
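The difference between the two approaches can be sketched as follows. This is an illustration only: the function names are hypothetical, and representing a logged-out user as `None` is an assumption, not how the experiment is implemented.

```python
from typing import Optional

NEWCOMER_MAX_EDITS = 100  # "≤100 cumulative edits" from the discussion


def is_newcomer(edit_count: Optional[int]) -> bool:
    """True for logged-out users (None) and accounts with at most 100 edits."""
    return edit_count is None or edit_count <= NEWCOMER_MAX_EDITS


def sees_treatment(edit_count: Optional[int], limit_exposure: bool) -> bool:
    """Approach 1 (limit_exposure=False): everyone can see the new button.
    Approach 2 (limit_exposure=True): only newcomers can see it."""
    return is_newcomer(edit_count) if limit_exposure else True


def counted_in_analysis(edit_count_at_first_exposure: Optional[int]) -> bool:
    """Under both approaches, the analysis covers the newcomer group only."""
    return is_newcomer(edit_count_at_first_exposure)
```

Under the first approach an editor with 5,000 edits would see the button but be excluded from the analysis; under the second, that editor would never see it.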
@MNeisler: the proposal you shared in T410319#11402693 looks great to me.
