Run a query to prove or disprove our template discovery hypothesis
Closed, Resolved (Public)

Description

Run a query so that we can test the following hypothesis:

"If we introduce the ability for volunteers to add template favourites, then at least 1,000 contributors will favourite 1 template."

To query across the cluster, we'll likely use the wmfdata-python library on a stats machine.
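A minimal sketch of what a per-wiki run might look like, in Python. The `run_sql` callable is a hypothetical stand-in for whatever executes SQL against one wiki's database (on a stats machine it might wrap wmfdata-python, but that call is not shown in this task), and `fake_run_sql` only exists to make the sketch runnable. The table, columns, and property name are the ones from the debugging query in this task.

```python
from typing import Callable, Dict, Iterable, List, Tuple

PROPERTY = "templatedata-favorite-templates"

# Same table and columns as the debugging query in this task.
SQL = (
    "SELECT up_user, up_value FROM user_properties "
    f"WHERE up_property = '{PROPERTY}'"
)

Row = Tuple[int, str]


def collect_rows(
    wikis: Iterable[str],
    run_sql: Callable[[str, str], List[Row]],
) -> Dict[str, List[Row]]:
    """Run the same query on each wiki and keep the rows per wiki."""
    return {wiki: run_sql(SQL, wiki) for wiki in wikis}


def fake_run_sql(sql: str, wiki: str) -> List[Row]:
    """Hypothetical runner: returns the one row shown in the task notes."""
    return [(9999, "[13450,90327]")]


rows_by_wiki = collect_rows(["testwiki"], fake_run_sql)
```

Rows are kept per wiki because `up_user` is a local user ID, so the same number on two wikis is not necessarily the same person.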


Notes for debugging:

  • on test.wikipedia, my user ID is 9999
wikiadmin2023@10.64.32.82(testwiki)> select * from user_properties where up_user = 9999 and up_property = 'templatedata-favorite-templates' limit 2;
+---------+---------------------------------+---------------+
| up_user | up_property                     | up_value      |
+---------+---------------------------------+---------------+
|    9999 | templatedata-favorite-templates | [13450,90327] |
+---------+---------------------------------+---------------+
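Judging by the row above, `up_value` looks like a JSON array of numeric IDs. A small sketch of turning such a value into a count, assuming that format holds for every row (the helper name `favorite_ids` is made up for illustration):

```python
import json
from typing import List


def favorite_ids(up_value: str) -> List[int]:
    """Parse an up_value such as '[13450,90327]' into a list of IDs.

    Assumes the value is a JSON array of integers, as in the sample row.
    Anything that is not valid JSON, or not a list, counts as no favorites.
    """
    try:
        parsed = json.loads(up_value)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return [int(item) for item in parsed]


print(len(favorite_ids("[13450,90327]")))  # prints 2
```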

Event Timeline

hm, it seems mediawiki_user_properties from wmf_raw is still limited to the preferences exposed in the replicas :/ (i.e. not the one we're interested in)

Do we really want to monitor template-favouriting-use on a continuous basis? If not then a dashboard seems like overkill for this. We can test the hypothesis by querying the db, right? A dashboard that nobody is gonna look at seems like a waste

I think it is only overkill depending on how long it takes. If it takes less than a day, it might be fine, as we want to explore template usage further.

Good point both

@KSiebert I'd say it would take longer than it's worth when a quick query would do just as well for confirming the hypothesis — I can close this task as not done?

@TheresNoTime I'd prefer to rewrite the task so that it covers writing the actual query and confirming the hypothesis.

TheresNoTime renamed this task from Create SuperSet dashboard for tracking template favouriting to Run a query to prove or disprove our template discovery hypothesis. May 6 2025, 1:38 PM
TheresNoTime updated the task description.

From discussion: If this runs every week/month, we can still graph something. We should start after deploying to the pilot wikis

@KSiebert @TheresNoTime we should modify the framing here. Instead of requiring at least 1k users to have favorited a template, the hypothesis should measure the number of people who have favorited at least 5 templates.

To get a full picture, we should know:

  • Number of users that open the dialog
  • Number of users that favorite at least 1 template
  • Number of users with 1+ templates favorited
  • Total number of favorites
  • Most popular favorited templates

We should also be able to calculate the average number of templates per user who opens the dialog.

Just so you're aware, that level of analytics will need event tracking set up (instead of a simple database query) — not an issue, just will take a bit longer to set up and get going

@TheresNoTime I don't want to distract you from other things now, but will we need support for instrumentation from other teams?

Just noting that an initial run shows that 65 editors have favourited 1 or more templates as of today

@TheresNoTime - Ideally we don't want to add event tracking right now; which of these metrics would require event tracking set up?

  • Number of users that open the dialog
  • Number of users that favorite at least 1 template
  • Number of users with 1+ templates favorited
  • Total number of favorites
  • Most popular favorited templates

  • Number of users that open the dialog: event tracking
  • Number of users that favorite at least 1 template: event tracking (if you want as-it-happens tracking; otherwise the query for the next item answers practically the same question)
  • Number of users with 1+ templates favorited: query
  • Total number of favorites: query
  • Most popular favorited templates: query ("expensive"/slow)
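The three query-based metrics can be sketched in Python as one pass over `(up_user, up_value)` rows. This assumes `up_value` is a JSON array of template IDs, as in the sample row in the description; the function name `summarise` and the extra sample rows for users 1 and 2 are made up for illustration.

```python
import json
from collections import Counter


def summarise(rows, min_favorites=1, top_n=3):
    """Compute the query-based metrics from (up_user, up_value) rows."""
    per_user = {}
    for up_user, up_value in rows:
        ids = json.loads(up_value)  # assumed: a JSON array of IDs
        if ids:
            per_user[up_user] = ids

    # Count each template once per user, even if it is listed twice.
    popularity = Counter(
        template_id for ids in per_user.values() for template_id in set(ids)
    )
    return {
        "users_with_min_favorites": sum(
            1 for ids in per_user.values() if len(ids) >= min_favorites
        ),
        "total_favorites": sum(len(ids) for ids in per_user.values()),
        "most_popular": popularity.most_common(top_n),
    }


# First row is from the task notes; the other two are invented.
sample = [(9999, "[13450,90327]"), (1, "[13450]"), (2, "[]")]
summary = summarise(sample)
```

Building the popularity counter touches every favorite of every user, which is one plausible reason the "most popular" metric is the slow one; raising `min_favorites` gives the same count for a stricter threshold.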

nb. from Slack: also query how many users have 5+ favourite templates


Total users with preference 'templatedata-favorite-templates': 461
Total users with preference 'templatedata-favorite-templates' having 5+ favorites: 44
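For context, the two totals can be put against each other and against the 1,000-contributor figure from the original hypothesis. This is only arithmetic on the numbers above, and it assumes every user who has the preference set has at least one favorite, which the task does not confirm.

```python
total_users = 461   # users with the preference set (from the result above)
users_5_plus = 44   # of those, users with 5+ favorites (from the result above)
target = 1000       # threshold named in the original hypothesis

share_5_plus = users_5_plus / total_users
progress = total_users / target

print(f"{share_5_plus:.1%} of users with the preference have 5+ favorites")
print(f"{progress:.1%} of the 1,000-contributor target")
```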
KSiebert changed the task status from Open to In Progress. Jul 16 2025, 9:40 AM