Evaluate AQS use of GrowthExperiments new impact dashboard
Closed, Resolved (Public)

Description

T222310: [EPIC] Positive reinforcement: New Impact Module involves showing aggregated pageview data to users so they can understand the positive impact they make on Wikipedia. This currently takes two forms:

  • a maintenance script (refreshUserImpactData.php) to cache pageview data of all articles the user has edited, for users who are in the target audience (users who have registered in the last two weeks, or registered in the last year and edited within the last two weeks)
  • on-demand fetching of the same data on a cache miss when a user visits Special:Homepage or Special:Impact
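
The target-audience rule in the first bullet can be sketched as a small predicate. This is only an illustrative Python sketch, not the query the maintenance script actually runs: the name `is_target_user` and its parameters are hypothetical, and "two weeks" and "one year" are assumed to mean 14 and 365 days.

```python
from datetime import datetime, timedelta
from typing import Optional

TWO_WEEKS = timedelta(days=14)   # assumed reading of "two weeks"
ONE_YEAR = timedelta(days=365)   # assumed reading of "one year"


def is_target_user(registered: datetime,
                   last_edit: Optional[datetime],
                   now: datetime) -> bool:
    """Hypothetical check: is this user in the target audience?"""
    account_age = now - registered
    # Case 1: registered in the last two weeks.
    if account_age <= TWO_WEEKS:
        return True
    # Case 2: registered in the last year AND edited within the last two weeks.
    edited_recently = last_edit is not None and (now - last_edit) <= TWO_WEEKS
    return account_age <= ONE_YEAR and edited_recently
```

The commands further down add a minimum edit count (--hasEditsAtLeast=3) on top of this rule; that filter is left out of the sketch.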

Fetching happens via the PageViewInfo extension which has a per-page cache, and applies some resource limits (it favors returning incomplete data over making many AQS requests from a single PHP request). Due to how the code is structured, those limits are applied to the data fetched for a single user, but not in total, so we want to make sure we don't generate a volume of requests that causes stress on AQS.

Test plan: run /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php on mwmaint1002 without the --use-job-queue option and watch the stats. (This is the master version of the refreshUserImpactData.php maintenance script from the GrowthExperiments extension, which has some helpful flags not yet present in wmf.10.) Run it first on a random small wiki (let's say akwiki), then on a random medium wiki (let's say abwiki), then on the pilot wikis from smallest to largest (bnwiki, cswiki, arwiki, eswiki). After the deployment on Thursday, re-run on the pilot wikis with the --use-job-queue flag.

For the general deployment plan, see T222310#deployment-plan.

Emergency switch:

  • Before the Thursday post-train backport window: just kill the script. All it does is cache warming, so it can be safely interrupted.
  • After the Thursday post-train backport window:
    • To disable the maintenance script, set the $wgGERefreshUserImpactDataMaintenanceScriptEnabled MediaWiki config variable to false.
    • To disable on-demand data loading, set the $wgGEUseNewImpactModule MediaWiki config variable to false.

Monitoring:


Progress:

  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php akwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.akwiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php akwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.akwiki.2.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php abwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.abwiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php abwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.abwiki.2.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php bnwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.bnwiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php bnwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.bnwiki.2.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php cswiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.cswiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php cswiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.cswiki.2.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php arwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.arwiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php arwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.arwiki.2.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php eswiki --registeredWithin=2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.eswiki.1.log
  • time mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php eswiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --verbose | tee refreshUserImpactData.eswiki.2.log
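
The command list above follows a regular pattern (six wikis, two user filters each), so it could be generated rather than typed by hand. The Python sketch below only builds the command strings; it does not run anything. The script path, flags and log file names are copied from the list above, while the helper name `build_commands` is hypothetical.

```python
SCRIPT = "/home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php"

# Wikis in the order given by the test plan: small, medium, then pilot wikis.
WIKIS = ["akwiki", "abwiki", "bnwiki", "cswiki", "arwiki", "eswiki"]

# The two user filters used for each wiki, in run order.
FILTERS = [
    "--registeredWithin=2week",
    "--registeredWithin 1year --editedWithin 2week",
]


def build_commands(wikis=WIKIS):
    """Return the shell command strings for every wiki and filter."""
    commands = []
    for wiki in wikis:
        for run, user_filter in enumerate(FILTERS, start=1):
            commands.append(
                f"time mwscript {SCRIPT} {wiki} {user_filter} "
                f"--hasEditsAtLeast=3 --force --verbose "
                f"| tee refreshUserImpactData.{wiki}.{run}.log"
            )
    return commands


if __name__ == "__main__":
    for command in build_commands():
        print(command)
```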

Event Timeline

Proposed test plan: run the maintenance script without the --use-job-queue option, first on a random small wiki (let's say akwiki), then on a random medium wiki (let's say abwiki), then on the pilot wikis from smallest to largest (bnwiki, cswiki, arwiki, eswiki) and watch the stats. Then re-run on the pilot wikis with the --use-job-queue flag.

For the non-pilot wikis, this would require adding a force flag.

Change 861716 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData.php: Add force and dry-run flags

https://gerrit.wikimedia.org/r/861716

For awareness:

  • link this task from the weekly train task
  • ping Data Engineering
  • ping Service Operations

Change 861803 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData.php: Add minimum edit filter

https://gerrit.wikimedia.org/r/861803

Change 861716 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData.php: Add force and dry-run flags

https://gerrit.wikimedia.org/r/861716

Change 861803 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData.php: Add minimum edit filter

https://gerrit.wikimedia.org/r/861803

Change 861817 had a related patch set uploaded (by Sergio Gimeno; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.10] refreshUserImpactData.php: Add force and dry-run flags

https://gerrit.wikimedia.org/r/861817

Change 861818 had a related patch set uploaded (by Sergio Gimeno; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.10] refreshUserImpactData.php: Add minimum edit filter

https://gerrit.wikimedia.org/r/861818

Change 861818 abandoned by Sergio Gimeno:

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.10] refreshUserImpactData.php: Add minimum edit filter

Reason:

https://gerrit.wikimedia.org/r/861818

Change 861817 abandoned by Sergio Gimeno:

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.10] refreshUserImpactData.php: Add force and dry-run flags

Reason:

https://gerrit.wikimedia.org/r/861817

Change 861838 had a related patch set uploaded (by Sergio Gimeno; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.12] refreshUserImpactData.php: Add minimum edit filter

https://gerrit.wikimedia.org/r/861838

Change 861468 had a related patch set uploaded (by Sergio Gimeno; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.12] refreshUserImpactData.php: Add force and dry-run flags

https://gerrit.wikimedia.org/r/861468

Change 861468 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.12] refreshUserImpactData.php: Add force and dry-run flags

https://gerrit.wikimedia.org/r/861468

Change 861838 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.12] refreshUserImpactData.php: Add minimum edit filter

https://gerrit.wikimedia.org/r/861838

Change 861964 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[operations/puppet@production] growthexperiments: Use min edit limit for user impact refresh

https://gerrit.wikimedia.org/r/861964

Dry run:

  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php akwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 3 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php akwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 2 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php abwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 0 users (?)
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php abwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 0 users (?)
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php bnwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 79 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php bnwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 175 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php cswiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 103 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php cswiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 326 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php arwiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 338 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php arwiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 744 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php eswiki --registeredWithin=2week --hasEditsAtLeast=3 --force --dry-run: 976 users
  • mwscript /home/tgr/T323958-deploy-newimpact/refreshUserImpactData.php eswiki --registeredWithin 1year --editedWithin 2week --hasEditsAtLeast=3 --force --dry-run: 2424 users

Tgr updated the task description.

Mentioned in SAL (#wikimedia-operations) [2022-11-30T08:07:50Z] <tgr> dry-running GrowthExperiments refreshUserImpactData.php - T323958#8430768

Tgr changed the task status from Open to In Progress. (Nov 30 2022, 11:02 PM)
Tgr claimed this task.
Tgr triaged this task as High priority.
Tgr moved this task from Incoming to In Progress on the Growth-Team (Sprint 0 (Growth Team)) board.

Mentioned in SAL (#wikimedia-operations) [2022-11-30T23:06:04Z] <tgr> running GrowthExperiments refreshUserImpactData.php (and generating a bunch of AQS requests) for T323958

The users were processed at a fairly steady rate of 10 per second. (For the four pilot wikis together, that's something like 7-8 minutes per daily run.) The load graphs didn't visibly budge and there were no errors in the logs. I spot-checked the pageview data for some users, and it looks reasonable.
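
As a rough cross-check of the run-time estimate, the dry-run user counts for the four pilot wikis can be divided by the observed processing rate. This Python sketch assumes the two runs per wiki are simply added together without removing users matched by both filters, so it is an upper bound; the function name is hypothetical. It gives roughly 8.6 minutes, in the same range as the estimate above.

```python
# Dry-run user counts per pilot wiki: (first filter, second filter).
DRY_RUN_USERS = {
    "bnwiki": (79, 175),
    "cswiki": (103, 326),
    "arwiki": (338, 744),
    "eswiki": (976, 2424),
}

USERS_PER_SECOND = 10  # observed processing rate


def estimated_minutes(counts=DRY_RUN_USERS, rate=USERS_PER_SECOND):
    """Upper-bound run time in minutes, ignoring overlap between the two runs."""
    total_users = sum(first + second for first, second in counts.values())
    return total_users / rate / 60


if __name__ == "__main__":
    print(f"{estimated_minutes():.1f} minutes")
```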

The baseline traffic to the internal AQS pageviews endpoint is around 100 requests per second. The external one is similar, but with sustained spikes of 500+/s (also 1500/s of 404 requests, not sure what's up with that). If we guess 5 edited pages per user on average, processing 10 users per second adds about 50 requests per second: not negligible, but it also doesn't seem like it would cause a problem. I think it's safe to say we don't need to worry about pageviews, at least until rollout to significantly more or larger wikis.
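
The arithmetic behind that estimate, as a small Python sketch. It assumes one AQS request per edited page and no hits in the PageViewInfo per-page cache, so it is a worst case for the guessed average; the 5 pages per user figure is the guess from the comment above, not a measurement, and the function names are hypothetical.

```python
def extra_request_rate(users_per_second: float, pages_per_user: float) -> float:
    """Extra AQS requests per second, assuming one request per edited page."""
    return users_per_second * pages_per_user


def relative_increase(extra: float, baseline: float) -> float:
    """Extra load as a fraction of the baseline request rate."""
    return extra / baseline


if __name__ == "__main__":
    extra = extra_request_rate(users_per_second=10, pages_per_user=5)
    print(extra)                                   # 50 requests per second
    print(relative_increase(extra, baseline=100))  # 0.5, i.e. 50% over baseline
```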

Change 861964 merged by RLazarus:

[operations/puppet@production] growthexperiments: Use min edit limit for user impact refresh

https://gerrit.wikimedia.org/r/861964