
Create per-wiki user preference metrics
Open, Needs Triage, Public, Feature

Description

As a software engineer who adds features to MediaWiki behind user preferences, I would like a way to query metrics about the adoption of these features (e.g. "How many people have enabled this user preference on en.wikipedia?"), so that I can quantify impact and get an idea of overall feature use without having to run queries such as SELECT COUNT(up_user) FROM user_properties WHERE up_property = 'editrecovery' AND up_value = 1;.

Example data questions

  • On the English Wikipedia, how many people have enabled Edit Recovery?

Privacy

User preferences (stored in the user_properties table) are redacted from the replicas for privacy reasons, which of course makes sense. This task requests the addition of 'high level' metrics, such as:

  1. Per wiki, per user preference, enabled count
  2. Per wiki, per user preference, GlobalPreference override count
  3. GlobalPreferences, per user preference, enabled count

The data could/should be kept private (i.e. only available on the analytics cluster).
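A minimal sketch of what such a 'high level' aggregate could look like, assuming a simplified stand-in for one wiki's user_properties table (the real up_value column is a blob, and the wiki name and rows below are made up for illustration). The output carries only counts, never user IDs, and it counts stored rows only, so it does not account for default values:

```python
import sqlite3

# Hypothetical, simplified stand-in for one wiki's user_properties table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_properties (up_user INTEGER, up_property TEXT, up_value TEXT)"
)
conn.executemany(
    "INSERT INTO user_properties VALUES (?, ?, ?)",
    [
        (1, "editrecovery", "1"),
        (2, "editrecovery", "1"),
        (3, "editrecovery", "0"),
        (1, "skin", "vector-2022"),
    ],
)


def stored_value_counts(conn, wiki):
    """Return {(wiki, property, value): number of users}, with no user IDs."""
    rows = conn.execute(
        "SELECT up_property, up_value, COUNT(DISTINCT up_user) "
        "FROM user_properties GROUP BY up_property, up_value"
    )
    return {(wiki, prop, value): n for prop, value, n in rows}


counts = stored_value_counts(conn, "enwiki")
```

Grouping by value as well as by key keeps non-binary preferences (such as skin selection) usable in the same table.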

Event Timeline

TheresNoTime changed the subtype of this task from "Task" to "Feature Request". (Wed, Apr 3, 9:36 AM)
TheresNoTime updated the task description.

This is a really great feature request @TheresNoTime!

@mpopov, do you have thoughts on a simple way to accomplish this? Maybe via a Superset dashboard or something similar? Would love to hear what you think.

@VirginiaPoundstone: Howdy! The underlying dataset will probably be the hardest part of this because of the challenges of how user preferences are stored and used. And then yeah, a Superset dashboard would be the simplest way to make that data available to the end users. It wouldn't be through Turnilo because the metrics aren't additive across dimensions, so it would need to be Superset.
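One possible reading of "not additive across dimensions", sketched with made-up data: if the metric is a count of distinct users, summing per-wiki counts double-counts anyone who has the preference enabled on more than one wiki. The user IDs and wiki names below are hypothetical.

```python
# Hypothetical data: global user IDs with a preference enabled on each wiki.
enabled_by_wiki = {
    "enwiki": {1, 2, 3},
    "dewiki": {2, 3, 4},
}

per_wiki_counts = {wiki: len(users) for wiki, users in enabled_by_wiki.items()}

# Naive roll-up: add the per-wiki counts together.
sum_of_counts = sum(per_wiki_counts.values())

# Actual number of distinct users across both wikis.
distinct_users = len(set().union(*enabled_by_wiki.values()))
```

Here the naive sum is 6 while only 4 distinct users exist, which is why a tool that sums across dimensions would report a misleading total.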

As for the dataset, some initial thoughts/concerns:

  • There's a lot of junk in user preferences (from various extensions and from users running preference setting code), so we'd need to maintain an allowlist of preference keys.
  • user_properties has this fun feature where "Only non-default settings are stored" (https://www.mediawiki.org/wiki/Manual:User_properties_table)
    • This is probably also the case for global_preferences, but I don't know for sure because the documentation at https://www.mediawiki.org/wiki/Extension:GlobalPreferences/global_preferences_table is a bare-bones schema dump.
    • We might need to know the default value for each allowlisted key, because if a preference is enabled by default then user_properties can only tell us how many have disabled it. I'm pretty sure default can vary wiki-by-wiki (@TheresNoTime can you please confirm?) meaning we'd be counting # of enables on one wiki and # of disables on another (if the preference is binary).
    • Not all preferences are binary. MediaWiki skin selection isn't, for example.
  • Then there's also the "Automatically enable most beta features" preference (which can also be set globally and overridden locally on a wiki). I'm not sure whether it enables those features (such as Discussion Tools) individually or whether their enablement is governed by the beta feature toggle. More investigation needed.
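The allowlist and default-value concerns above can be sketched as follows. This is only an illustration under stated assumptions: the DEFAULTS mapping, its wiki names and default values, and the enabled_count helper are all hypothetical, the preference is treated as binary with values "1" and "0", and global preferences and the beta feature toggle are ignored.

```python
# Hypothetical allowlist: preference key -> default value per wiki.
# The defaults shown are illustrative, not the real configuration.
DEFAULTS = {
    "editrecovery": {"enwiki": "0", "testwiki": "1"},
}


def enabled_count(pref, wiki, total_users, stored_values):
    """Estimate how many users have a binary preference enabled.

    stored_values maps user ID -> stored value and contains only users
    who have a row for this preference on this wiki.
    """
    if pref not in DEFAULTS:
        raise KeyError(f"{pref!r} is not on the allowlist")
    default = DEFAULTS[pref][wiki]
    explicitly_on = sum(1 for value in stored_values.values() if value == "1")
    if default == "1":
        # Users without a stored row inherit the enabled default.
        without_row = total_users - len(stored_values)
        return without_row + explicitly_on
    # Default is off: only explicit "1" rows count as enabled.
    return explicitly_on


on_by_opt_in = enabled_count("editrecovery", "enwiki", 1000, {1: "1", 2: "1", 3: "0"})
on_by_default = enabled_count("editrecovery", "testwiki", 10, {1: "0", 2: "0"})
```

With these inputs the first call counts opt-ins and the second counts everyone who has not opted out, so both results mean "users with the feature enabled" even though the stored rows mean opposite things on the two wikis.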

> a simple way to accomplish this

I think there's a simple way to accomplish this, but I don't think the end result would be particularly useful. For the end result to actually answer "I want to know how many users have this feature enabled", this will need careful planning and consideration to account for the complexity of user preferences.

> We might need to know the default value for each allowlisted key, because if a preference is enabled by default then user_properties can only tell us how many have disabled it. I'm pretty sure default can vary wiki-by-wiki (@TheresNoTime can you please confirm?) meaning we'd be counting # of enables on one wiki and # of disables on another (if the preference is binary).

Unfortunately the default can vary wiki-by-wiki (e.g. as done here)