Test principles
We will include all contributors in the A/B test who:
- Do not have a sticky preference set
- Have made fewer than 100 edits overall
- Are anonymous or registered users
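The inclusion criteria above can be expressed as a single predicate. This is a minimal sketch: the `Contributor` record and its field names are hypothetical and do not reflect an existing schema.

```python
from dataclasses import dataclass

MAX_EDIT_COUNT = 100  # from the criteria above: fewer than 100 overall edits


@dataclass
class Contributor:
    # Hypothetical record; field names are illustrative only.
    has_sticky_preference: bool
    edit_count: int
    is_anonymous: bool


def is_eligible(c: Contributor) -> bool:
    """Return True if the contributor meets every inclusion criterion."""
    # Anonymous and registered users are both included,
    # so is_anonymous is deliberately not checked here.
    return (not c.has_sticky_preference) and c.edit_count < MAX_EDIT_COUNT
```

For example, `is_eligible(Contributor(False, 12, True))` is true, while a contributor with exactly 100 edits or with a sticky preference set is excluded.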
Bucketing criteria
The first time we see an anonymous user, we should use client-side logic to randomly assign them to one of the two buckets and then set a cookie, so that every EditAttemptStep event they send includes the appropriate value in the “bucket” field. If we do this for anonymous editors, we should bucket registered users the same way.
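The assign-once-then-remember logic described above can be sketched as follows. The real implementation would run client-side in the browser; Python is used here only to illustrate the logic, with a plain dict standing in for the cookie jar. The bucket names, the cookie name, and the simplified event shape are all assumptions.

```python
import random
from typing import Optional

BUCKETS = ("control", "test")  # hypothetical bucket names
COOKIE_NAME = "ab_bucket"      # hypothetical cookie name


def get_or_assign_bucket(cookies: dict, rng: Optional[random.Random] = None) -> str:
    """Return the stored bucket, assigning one at random on first sight."""
    chooser = rng if rng is not None else random
    bucket = cookies.get(COOKIE_NAME)
    if bucket not in BUCKETS:
        # First visit (or an unrecognized value): pick a bucket with
        # equal probability and persist it in the cookie.
        bucket = chooser.choice(BUCKETS)
        cookies[COOKIE_NAME] = bucket
    return bucket


def build_event(cookies: dict, action: str) -> dict:
    """Build a simplified stand-in for an EditAttemptStep event."""
    return {"action": action, "bucket": get_or_assign_bucket(cookies)}
```

Because the bucket is read back from the cookie on every later call, all events from the same browser carry the same “bucket” value for as long as the cookie survives.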
Storing persistent identifiers for bucketed anons
We have to run the test and conduct the analysis within 90 days of the test start date. We could use IP address + user agent as the ID, or create and store our own identifier (more robust, but harder and probably not worth the effort).
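One way the IP address + user agent option could look is sketched below. Keying the hash with a secret salt is an assumption added here so that raw IP addresses are not stored; it is not something this plan has decided. The function name and parameters are hypothetical.

```python
import hashlib
import hmac


def derive_identifier(ip: str, user_agent: str, secret_salt: bytes) -> str:
    """Derive a stable pseudonymous ID from IP address + user agent.

    Assumption: secret_salt is a per-test secret that is discarded once
    the analysis is complete.
    """
    message = f"{ip}\n{user_agent}".encode("utf-8")
    return hmac.new(secret_salt, message, hashlib.sha256).hexdigest()
```

The same IP address and user agent always map to the same ID. Two people who share both would be merged into one ID, and one person whose IP address changes would be split across several, which is the robustness trade-off against creating and storing our own identifier.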
Research questions*
Total number of completed edits: Do contributors in one test group complete more edits than contributors in the other test group?
Time to save an edit: Do contributors in one test group complete their edits more quickly than contributors in the other test group? This is a metric we would need to look at alongside other measures, like the size of the edits being made.
Editor retention: Are contributors in one test group more likely to come back to edit again than contributors in the other test group?
Edit quality: Are edits by contributors in one test group more likely to be reverted than edits by contributors in the other test group?
Disruption: Do contributors in one test group switch between editing interfaces more often than contributors in the other test group (i.e., people fleeing back to wikitext)?
*See T221187
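As an illustration of how two of the research questions above (total completed edits and edit quality) might be computed per bucket, here is a minimal sketch. The record shape, a dict with `bucket` and `reverted` keys, is hypothetical, and the sample rows are toy data, not results.

```python
from collections import defaultdict


def summarize_by_bucket(edits):
    """Return completed-edit counts and revert rates per bucket."""
    totals = defaultdict(int)
    reverted = defaultdict(int)
    for edit in edits:
        totals[edit["bucket"]] += 1
        if edit["reverted"]:
            reverted[edit["bucket"]] += 1
    return {
        bucket: {
            "completed_edits": totals[bucket],
            "revert_rate": reverted[bucket] / totals[bucket],
        }
        for bucket in totals
    }


# Toy data for illustration only.
sample = [
    {"bucket": "control", "reverted": True},
    {"bucket": "control", "reverted": False},
    {"bucket": "test", "reverted": False},
    {"bucket": "test", "reverted": False},
]
summary = summarize_by_bucket(sample)
```

Time to save, retention, and interface switching would need additional fields (timestamps, an identifier per contributor, and the editing interface used), so they are left out of this sketch.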
Blocking questions
- Is Legal okay with us creating a persistent identifier for anonymous contributors? (Doing this would make anonymous contributors less anonymous.)
- Whether we store a persistent identifier based on IP + user agent or create and store our own, either option requires a modification to EditAttemptStep: how should changes to EditAttemptStep be implemented?
Open questions
- Under what conditions are contributors' editing interface preferences set? See T221195#5201746
- How will we bucket users if we want to include contributors who are not logged in?
- What level of precision is appropriate for this A/B test?
- What wikis are being included in this A/B test? See T222803
- Confirm: can we distinguish between contributors who do and do not have a sticky editing interface preference set? Yes
- Confirm: what impact, in terms of scale, does including all contributors who do not have a sticky preference set have on the number of edit sessions?
- How long will the test last?
- How are we going to monitor the test? What will trigger us to interrupt the test?
- What is the range of actions we could take after the test concludes? What will determine which action(s) we take?
- What could we do to exclude editors from our edit completion rate numbers who we assume do not have any intention of completing an actual edit?
"Done"
- A Phabricator task is created that specifies how engineering should implement/instrument the A/B test