Deploy Extension:PersonalDashboard to English Wikipedia
Closed, Resolved · Public · 1 Estimated Story Points

Description

This would be a silent rollout - it should be configured to not display anything to users.

We plan to deploy the PersonalDashboard extension to some pilot wikis (T417665) to learn how useful it is to novice moderators as a way to introduce them to patrolling. We would like to add English Wikipedia to the list of wikis receiving this software, so that we can learn from a larger pool of editors.

English Wikipedia is an 'easy' wiki for us to deploy to, because we don't need to spend time getting strings translated and updated as we iterate on this experience.

This depends on T418283 (Add a PersonalDashboard configuration for upper bound on number of edits to invite users), so that we only invite potential novice moderators and don't bother our experienced editors, who are unlikely to find the Dashboard valuable in its current form. We can revisit this in the future.

Event Timeline

Restricted Application added a subscriber: Aklapper.

@MNeisler We'd like to re-run the analysis you did in T402802 / T412970 to learn how many users in the 100-500 edit count range might receive this invite during a 2 or 4 week period on English Wikipedia.

Here is the estimated number of editors who might receive this invite on English Wikipedia.

  • 9,987 users over a 2-week timeframe
  • 15,905 users over a 4-week timeframe

This is based on the following criteria.

  • Made at least one main namespace edit in the last 2 or 4 weeks
  • Has between 100 and 500 cumulative edits

If we wanted to limit the upper edit count threshold to 300 edits, then that would reduce the number of editors to the following:

  • 7,162 users over a 2-week timeframe
  • 11,681 users over a 4-week timeframe
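
To make the trade-off between the two thresholds concrete, here is a minimal sketch of the arithmetic using only the four estimates quoted above. The variable and function names are illustrative and not part of the original analysis.

```python
# Estimated eligible editors from the comment above, keyed by timeframe.
# Each tuple is (count with a 500-edit upper bound, count with a 300-edit upper bound).
ESTIMATES = {
    "2-week": (9_987, 7_162),
    "4-week": (15_905, 11_681),
}


def threshold_impact(upper_500: int, upper_300: int) -> tuple[int, float]:
    """Return (editors dropped, share of the 100-500 pool that is kept)."""
    dropped = upper_500 - upper_300
    kept_share = upper_300 / upper_500
    return dropped, kept_share


for timeframe, (upper_500, upper_300) in ESTIMATES.items():
    dropped, kept_share = threshold_impact(upper_500, upper_300)
    print(f"{timeframe}: {dropped:,} fewer editors, {kept_share:.1%} of the pool kept")
```

In both timeframes, lowering the upper bound to 300 edits keeps a bit over 70% of the 100-500 pool.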

Either one of these options will add a sufficient number of distinct users to exceed the target identified in T412970 to establish baselines (assuming there are no significant decreases in the number of unique English Wikipedia editors in March). I'll provide more details on some usage milestone targets to monitor once the dashboard is deployed (Ticket pending).

cc @Samwalton9-WMF

@Samwalton9-WMF enwiki note: we don't have a sample rate for the instrumentation, nor do we have the revert risk filter deployed.

@MNeisler We're concerned that the instrumentation on the Dashboard menu link is going to generate too much data on English Wikipedia, since the impression will be logged on every page load. What might a sensible sample rate be for that? (We'll leave the Dashboard click instrumentation alone since it should be much lower volume.)

And for revert risk - let's use the damaging ORES model instead, which is deployed here. We can use the 'likely have problems' threshold (the lower of the two).

Samwalton9-WMF changed the task status from Open to Stalled. Mar 3 2026, 4:11 PM

Currently, TestKitchen guidance suggests a sampling rate of 0.1% on English Wikipedia. Note that this is specifically the enrollment sampling rate threshold for an AB test instrumented with edge uniques, so it is based on the number of unique devices ("The maximum traffic allocation on English Wikipedia is 0.1% – which, given the volume of reader traffic, would still be approximately 33K unique devices/day.").

If we're logging an impression on every page load, we could likely go as low as 0.1% and still achieve a sufficient sample size, but we have some flexibility to increase that rate since we're not running an AB test. I'd recommend not exceeding 5%, though, given the high traffic rates on enwiki.

Thanks! This would only be for logged-in users in that 100 - 500 edit range (I misspoke in implying it was every reader page load), so I assume we should go higher than 0.1%?

Got it - thanks for clarifying! Yes, in that case we should definitely sample at a higher rate. I'd recommend a 10% sampling rate, which we've used for general engagement metrics on logged-in editors at English Wikipedia in the past.

A 10% sampling rate should still result in a sufficient sample size for analyzing engagement with the Dashboard menu link given the estimates identified above and seems unlikely to generate too much data.

(We'll leave the Dashboard click instrumentation alone since it should be much lower volume).

Just to confirm, are we leaving the pageview instrumentation (visits to the Dashboard) alone as well?

Yes, I think so. I suspect we should get a more manageable amount of data about content on the Dashboard itself, since those events will only be firing once users are actually on the Special page?

Samwalton9-WMF changed the task status from Stalled to Open. Mar 17 2026, 4:18 PM

T418365 and T418875 are merged, so we could go ahead with a silent rollout on this to test functionality.

Samwalton9-WMF raised the priority of this task from Medium to High. Mar 17 2026, 4:19 PM
Samwalton9-WMF moved this task from To be estimated to Estimated on the Moderator-Tools-Team board.
Samwalton9-WMF set the point value for this task to 1.
Samwalton9-WMF moved this task from Estimated to Kanban on the Moderator-Tools-Team board.

Change #1254865 had a related patch set uploaded (by Kgraessle; author: Kgraessle):

[operations/mediawiki-config@master] Deploy Extension:PersonalDashboard to English Wikipedia

https://gerrit.wikimedia.org/r/1254865

Change #1254865 merged by jenkins-bot:

[operations/mediawiki-config@master] Deploy Extension:PersonalDashboard to English Wikipedia

https://gerrit.wikimedia.org/r/1254865

Mentioned in SAL (#wikimedia-operations) [2026-03-19T20:25:44Z] <kgraessle@deploy2002> Started scap sync-world: Backport for [[gerrit:1254865|Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654|Deploy PRV to 13 wikis (T420273)]]

Mentioned in SAL (#wikimedia-operations) [2026-03-19T20:27:41Z] <kgraessle@deploy2002> kgraessle, arlolra: Backport for [[gerrit:1254865|Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654|Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-03-19T20:36:44Z] <kgraessle@deploy2002> Finished scap sync-world: Backport for [[gerrit:1254865|Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654|Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)

Hi @MNeisler, question for you now that we've turned on enwiki with a 10% sampling rate.
How will this impact retention rate measurements? For example, if we're only sampling 10% and miss a second visit (because the user wasn't sampled during that visit), won't our retention rates incorrectly look lower than they are?
Not sure if you have already addressed this or not, but just curious what you recommend?
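
The concern above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the thread does not say how the 10% sample is drawn, the 30% "true" retention rate is an invented illustrative value, and `user_in_sample` is a hypothetical helper rather than anything in the extension. It compares sampling each visit independently with sampling deterministically per user.

```python
import hashlib
import random

SAMPLE_RATE = 0.10  # the 10% rate discussed in this task


def user_in_sample(user_id: int, rate: float = SAMPLE_RATE) -> bool:
    """Hypothetical per-user sampling: hash the ID to a stable value in [0, 1)."""
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return int(digest[:8], 16) / 0x1_0000_0000 < rate


def simulate(n_users: int = 100_000, true_retention: float = 0.30, seed: int = 0):
    """Return observed retention under (per-event, per-user) sampling.

    true_retention is an illustrative assumption, not a measured value.
    """
    rng = random.Random(seed)
    event_first = event_both = 0  # each visit sampled independently
    user_first = user_both = 0    # a user is either always or never logged
    for user_id in range(n_users):
        returns = rng.random() < true_retention
        first_logged = rng.random() < SAMPLE_RATE
        second_logged = returns and rng.random() < SAMPLE_RATE
        if first_logged:
            event_first += 1
            event_both += second_logged
        if user_in_sample(user_id):
            user_first += 1
            user_both += returns
    return event_both / event_first, user_both / user_first


per_event, per_user = simulate()
print(f"per-event sampling: observed retention {per_event:.1%}")
print(f"per-user sampling:  observed retention {per_user:.1%}")
```

Under these assumptions, independent per-visit sampling shrinks observed retention to roughly the true rate multiplied by the sample rate, while per-user sampling stays close to the true rate. Whether the deployed instrument samples per pageview or per user is something to confirm against the actual configuration.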

@Kgraessle @Samwalton9-WMF @MNeisler: Hi! Okay, I think the source of the challenge here is bundling different metrics and product surfaces into a single data collection activity.

Recommendation: Split personal-dashboard-health-metrics into 2 separately managed/configured data collection activities:

  • personal-dashboard-entry-points (just impressions & clicks on the link)
    • Sampling:
      • Unit: pageview
      • Rate: 100% on low traffic wikis, start with 10% on enwiki with potential to increase (more on this below)
    • Contextual attributes:
      • Don't even need performer_id/performer_name for this one, just have it be a simple total clicks / total impressions.
      • Any other contextual attributes (mediawiki_database, performer_edit_count_bucket) would be for dimensional breakdown of the CTR only to assess how CTR varies across different segments of users
    • If you want you can treat all entry points as a single data collection activity or make them be separate ones, e.g. personal-dashboard-entry-point-top-link as separate instrument / data collection activity from personal-dashboard-entry-point-sidebar
  • personal-dashboard-usage (the actual dashboard)
    • Sampling: default 100%
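
As a sketch of what the entry-point metric described above could look like, the following computes click-through rate as total clicks divided by total impressions, broken down by a contextual attribute. The attribute names `mediawiki_database` and `performer_edit_count_bucket` come from the recommendation; the `action` field, the bucket labels, and the sample events are assumptions made up for illustration and do not reflect the real event schema.

```python
from collections import Counter


def ctr_by(events: list[dict], attribute: str) -> dict[str, float]:
    """Click-through rate (clicks / impressions) per value of `attribute`."""
    impressions, clicks = Counter(), Counter()
    for event in events:
        segment = event[attribute]
        if event["action"] == "impression":  # "action" is a hypothetical field
            impressions[segment] += 1
        elif event["action"] == "click":
            clicks[segment] += 1
    # Segments with no impressions are skipped to avoid dividing by zero.
    return {seg: clicks[seg] / n for seg, n in impressions.items() if n}


def make_event(action: str, wiki: str, bucket: str) -> dict:
    """Build an illustrative event; no performer_id or performer_name needed."""
    return {
        "action": action,
        "mediawiki_database": wiki,
        "performer_edit_count_bucket": bucket,  # bucket labels are invented
    }


sample_events = (
    [make_event("impression", "enwiki", "100-199")] * 4
    + [make_event("click", "enwiki", "100-199")]
    + [make_event("impression", "testwiki", "200-499")] * 2
    + [make_event("click", "testwiki", "200-499")]
)

print(ctr_by(sample_events, "mediawiki_database"))
print(ctr_by(sample_events, "performer_edit_count_bucket"))
```

Because the metric is a ratio of totals, no per-user identifier is involved, which matches the suggestion to drop `performer_id`/`performer_name` for this activity.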

Regarding sampling rate on enwiki – a few weeks ago we (Experiment Platform) collected some data on traffic patterns, sampling 0.1% of edge uniques on enwiki. After filtering the hourly traffic down to permanent & temporary users only and back-multiplying by 1000, we're looking at <400K pageviews per hour from logged in users. If the link is shown to logged-in users with 100-500 edit counts only, you might be fine without any sampling on enwiki.

A few months ago we (the general we) accidentally ran an instrument that ran on 100% of all traffic (on all wikis) and things were mostly fine. So, given the targeting conditions of the entry point, you might be fine with 100% and can always dial it back if needed.
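
For reference, the back-multiplication mentioned above can be sketched as follows. The only figures taken from the thread are the 1-in-1000 (0.1%) sample and the resulting bound of under 400K logged-in pageviews per hour; the function names are illustrative. Note that the bound covers all logged-in users, and the share belonging to the 100-500 edit range is not given here, so these numbers are a ceiling on entry-point impressions rather than an estimate.

```python
def scale_up(sampled_count: int, one_in_n: int) -> int:
    """Estimate a total from a count observed in a 1-in-N sample."""
    return sampled_count * one_in_n


def expected_events(total_per_hour: int, rate_percent: int) -> int:
    """Events logged per hour if `rate_percent` of pageviews are sampled."""
    return total_per_hour * rate_percent // 100


# A 0.1% sample is 1 in 1000, so a bound of 400K pageviews/hour corresponds
# to roughly 400 sampled pageviews/hour before back-multiplying.
upper_bound = scale_up(400, 1000)

for rate in (10, 100):
    print(f"{rate}% sampling: at most {expected_events(upper_bound, rate):,} impressions/hour")
```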