
Compile list of eligible users for the ReadingLists experiment
Closed, Resolved | Public | 5 Estimated Story Points

Description

We will be setting a hidden preference for users eligible for the ReadingLists experiment on web (max 200,000 users across the participating wikis).

We will use a maintenance script (see T402231) to populate the hidden preference (and can use the userOptions.php script in core to clean this up after the experiment).

We need to compile a list of user ids as input for the script. This is probably best done with a Jupyter notebook, since we need to query the readinglists tables (on wikishared) as well as the watchlist and user tables (user_touched, user_editcount). The user_touched and watchlist data are not available on Toolforge or cloud Superset because they are sensitive private data, hence the suggestion to use Jupyter.

The script takes a comma-separated list of user ids, but could work with one line per user_id (and separate file per wiki).
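As a minimal sketch of the two input formats mentioned above, the notebook could emit either form with helpers like the following. The function names and the `<wiki>_user_ids.txt` file naming are hypothetical and are not taken from the maintenance script in T402231.

```python
from pathlib import Path


def to_comma_separated(user_ids):
    """Join user ids into a single comma-separated string."""
    return ",".join(str(uid) for uid in user_ids)


def write_one_id_per_line(ids_by_wiki, out_dir):
    """Write one file per wiki, with one user id per line.

    The "<wiki>_user_ids.txt" naming scheme is an assumption for illustration.
    Returns a dict mapping each wiki to the path that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for wiki, user_ids in ids_by_wiki.items():
        path = out_dir / f"{wiki}_user_ids.txt"
        path.write_text("".join(f"{uid}\n" for uid in user_ids))
        paths[wiki] = path
    return paths
```

For example, `to_comma_separated([12, 34, 56])` yields `"12,34,56"`, while `write_one_id_per_line({"arwiki": [12, 34]}, "out")` would create `out/arwiki_user_ids.txt` with two lines.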

https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Jupyter

In addition to testwiki (and test2wiki), the wikis participating in the initial experiment are:

  • Arabic - arwiki
  • French - frwiki
  • Vietnamese - viwiki
  • Chinese - zhwiki
  • Indonesian - idwiki

We also plan to run the experiment on English Wikipedia, but that will happen at a later date (TBD).

Criteria:

  • zero edits
  • no existing reading lists
  • <= 2 watchlist items (or 0 watchlist items outside User and User_talk namespaces)
  • only registered users (e.g. exclude temp accounts)
  • user_touched <= 3 months ago
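The criteria above could be expressed as a single predicate once the per-user data has been gathered in the notebook. This is only a sketch: the `UserRow` record and its field names are hypothetical, "3 months" is approximated as 90 days, and the user_touched criterion is read as "touched within the last 3 months", which is an interpretation rather than something the task states explicitly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UserRow:
    # Hypothetical per-user record; all field names are assumptions.
    user_id: int
    edit_count: int
    reading_list_count: int
    watchlist_total: int
    watchlist_outside_user_ns: int  # items outside User and User_talk
    is_temp_account: bool
    user_touched: datetime


def is_eligible(row, now, max_age_days=90):
    """Return True if the row meets every eligibility criterion.

    max_age_days=90 is an approximation of "3 months" (assumption).
    """
    few_watchlist_items = (
        row.watchlist_total <= 2 or row.watchlist_outside_user_ns == 0
    )
    recently_touched = now - row.user_touched <= timedelta(days=max_age_days)
    return (
        row.edit_count == 0
        and row.reading_list_count == 0
        and few_watchlist_items
        and not row.is_temp_account
        and recently_touched
    )
```

In practice most of these conditions would likely be pushed into the SQL queries themselves; the predicate form is mainly useful for spot-checking a sample of rows.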

@jwang has done an initial analysis of potentially eligible users, so we should work with her on this.

Event Timeline

aude updated the task description.
aude triaged this task as High priority. Sep 22 2025, 7:45 PM
aude set the point value for this task to 5. Sep 23 2025, 3:01 PM
aude moved this task from Needs refinement to Ready for sprint on the Reader Experience Team board.

Let's check with Amir whether 200,000 is still the limit for how many users we can bucket into the experiment. If it is, let's split our allotment of 200,000 between the two experiments:

  • 100,000 for the enwiki experiment (as enwiki has the largest bucket of eligible users)
  • 100,000 for the other experiment, split proportionally across the 5 target wikis based on Jennifer's initial eligibility numbers in https://phabricator.wikimedia.org/T401972#11133033
    • arwiki: 17,000 (17%)
    • frwiki: 34,000 (34%)
    • viwiki: 4,000 (4%)
    • zhwiki: 35,000 (35%)
    • idwiki: 10,000 (10%)
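The per-wiki split above follows directly from the listed percentages. As a quick check, assuming only those percentages and the 100,000 allotment (the function name is hypothetical):

```python
# Percentages as listed in the comment above.
SHARES_PERCENT = {
    "arwiki": 17,
    "frwiki": 34,
    "viwiki": 4,
    "zhwiki": 35,
    "idwiki": 10,
}


def allocate(total, shares_percent):
    """Split `total` across wikis according to integer percentage shares.

    Uses floor division, so the result can undershoot `total` slightly
    when total * pct is not a multiple of 100.
    """
    if sum(shares_percent.values()) != 100:
        raise ValueError("percentage shares must sum to 100")
    return {wiki: total * pct // 100 for wiki, pct in shares_percent.items()}


allocation = allocate(100_000, SHARES_PERCENT)
```

With these inputs the shares sum to 100%, so the allocation uses the full 100,000 with no remainder.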
HFan-WMF claimed this task.