
Cache newcomer tasks per user
Closed, Resolved (Public)

Description

Create a per-user cache of suggested edits tasks, populate it on user creation, and refresh it each time tasks are fetched (via visits to Special:Homepage or through toggling task/topic filters).

This would optimize the loading of Special:Homepage, but not the in-module interactions (toggling difficulty / topic filters). Improved performance when toggling filters will be handled in part by T242560.

The approach would be roughly:

  • When a user creates an account and is opted into the experiment, we make a request in the same request cycle to get a set of suggested edits (as if they had visited Special:Homepage and opted in to suggested edits, so we'd use the default values of copyedit+links and no topics) and cache it for that user for 30 days.
  • When the user visits Special:Homepage, the code which fetches suggested edits first checks whether the cache has data for the current set of filters (e.g. copyedit+links). If so, it loads the cached data; if not, it asks ElasticSearch for it and caches the result for 30 days.
  • When the user requests tasks, use a DB query to validate that the relevant maintenance templates are still present on the pages of the cached tasks; remove tasks from the cache which no longer have the maintenance templates.
  • On each change to the user's task/topic filter preferences, recache the data.
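
As a rough illustration of the approach above, here is a minimal sketch in Python (not the extension's actual PHP code). The `TaskCache` class, its method names, and the in-memory dictionary are all hypothetical stand-ins; only the 30-day lifetime, the copyedit+links default, and the cache-then-search order come from the description.

```python
import time
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, per the task description
DEFAULT_TASK_TYPES = frozenset({"copyedit", "links"})  # defaults named in the description
NO_TOPICS: FrozenSet[str] = frozenset()

# Hypothetical key: one cache entry per user and per filter combination.
CacheKey = Tuple[int, FrozenSet[str], FrozenSet[str]]
SearchFn = Callable[[FrozenSet[str], FrozenSet[str]], List[str]]


class TaskCache:
    """Illustrative per-user cache placed in front of a search backend."""

    def __init__(self, search: SearchFn, clock: Callable[[], float] = time.time):
        self._search = search  # stand-in for the ElasticSearch query
        self._clock = clock
        self._entries: Dict[CacheKey, Tuple[float, List[str]]] = {}

    def _store(self, key: CacheKey) -> List[str]:
        # Query the backend and cache the result for 30 days.
        tasks = self._search(key[1], key[2])
        self._entries[key] = (self._clock() + CACHE_TTL_SECONDS, tasks)
        return tasks

    def get_tasks(
        self,
        user_id: int,
        task_types: Iterable[str] = DEFAULT_TASK_TYPES,
        topics: Iterable[str] = NO_TOPICS,
    ) -> List[str]:
        # Homepage visit: use the cache if it holds unexpired data for these filters.
        key = (user_id, frozenset(task_types), frozenset(topics))
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        return self._store(key)

    def warm(self, user_id: int) -> None:
        # Account creation: pre-populate with the default filters.
        self.get_tasks(user_id)

    def on_filters_changed(
        self, user_id: int, task_types: Iterable[str], topics: Iterable[str]
    ) -> List[str]:
        # Filter preference change: always re-query and recache.
        return self._store((user_id, frozenset(task_types), frozenset(topics)))
```

Keying the entry on the filter set as well as the user is one possible design; it keeps a cached copyedit+links result from being served for a different filter combination.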

Event Timeline

Placing in Incoming for us to prioritize per the performance implications of T258021#6392589

kostajh updated the task description.

Change 623768 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Cache newcomer tasks per user

https://gerrit.wikimedia.org/r/623768

This is in code review so that we can discuss the proposed patch; we may decide not to do this. I figured a patch would be easier to discuss and would illustrate what I'm proposing to do.

Change 623776 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: Warm cache for variant C users

https://gerrit.wikimedia.org/r/623776

Change 625611 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Task suggester: Add parameter for emptying cache

https://gerrit.wikimedia.org/r/625611

Change 625611 abandoned by Kosta Harlan:
[mediawiki/extensions/GrowthExperiments@master] Task suggester: Add parameter for emptying cache

Reason:
Not needed

https://gerrit.wikimedia.org/r/625611

Change 623768 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Cache newcomer TaskSets per user

https://gerrit.wikimedia.org/r/623768

Change 623776 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: Warm cache for variant C users

https://gerrit.wikimedia.org/r/623776

Catrope added subscribers: Tgr, Catrope.

@kostajh @Tgr Does anything need to be done here wrt filtering out protected pages from the cached result?

No. In the SuggestedEdits module, to generate the task preview, we get the result from the cache and then apply the protection filter to it. For the task queue used in the module, we ask for 250 tasks and then filter out protected ones on the client side.

We should, however, add a database query that verifies that all the tasks in the TaskSet still have the relevant maintenance template present in the latest revision of the page. So I think we could move this task back into Ready for Development for that.

I'll get a patch up for that.

Change 628073 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] [WIP] Add TemplateFilter for validating cache contents

https://gerrit.wikimedia.org/r/628073

Change 628950 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Make TemplateFilter internal to CacheDecorator

https://gerrit.wikimedia.org/r/628950

Change 628073 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add TemplateFilter for validating TaskSet

https://gerrit.wikimedia.org/r/628073

Change 628950 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Limit caching to local tasks

https://gerrit.wikimedia.org/r/628950

Etonkovidova subscribed.

Checked in production - testwiki wmf.10.

When the user requests tasks, use a DB query to validate that the relevant maintenance templates are still present in the cached tasks; remove tasks from the cache which no longer have the maintenance templates

The following was checked

  • delete template
  • move page
  • protect page

The result: in each case the page is immediately removed from the SE feed, and it is added back when the action is reverted.
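
The template validation checked here can be sketched as follows. This is an illustrative Python example, not the TemplateFilter code from the patches: the `page_templates` table, the `filter_stale_tasks` function, and the template names in the usage note are hypothetical, and SQLite stands in for the wiki database.

```python
import sqlite3
from typing import Dict, List, Set, Tuple

Task = Tuple[str, str]  # (page_title, task_type); hypothetical shape


def filter_stale_tasks(
    conn: sqlite3.Connection,
    tasks: List[Task],
    templates_by_type: Dict[str, Set[str]],
) -> List[Task]:
    """Keep only tasks whose page still carries a template for its task type."""
    if not tasks:
        return []
    titles = sorted({title for title, _ in tasks})
    placeholders = ",".join("?" for _ in titles)
    # One query for all cached pages, against an assumed page_templates table.
    rows = conn.execute(
        "SELECT page_title, template FROM page_templates "
        f"WHERE page_title IN ({placeholders})",
        titles,
    ).fetchall()
    present: Dict[str, Set[str]] = {}
    for title, template in rows:
        present.setdefault(title, set()).add(template)
    # A task survives if at least one of its task type's templates is on the page.
    return [
        (title, task_type)
        for title, task_type in tasks
        if present.get(title, set()) & templates_by_type.get(task_type, set())
    ]
```

With a mapping such as `{"copyedit": {"Copy edit"}}`, deleting the template row for a page drops that page from the filtered list, and restoring the row brings it back on the next call, which mirrors the remove-then-re-add behaviour seen in the check above.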

Change 635765 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: Schedule a cache refresh via job queue

https://gerrit.wikimedia.org/r/635765

Change 636078 had a related patch set uploaded (by Catrope; owner: Catrope):
[operations/deployment-charts@master] Add changeprop rules for newcomerTasksCacheRefreshJob

https://gerrit.wikimedia.org/r/636078

Change 636078 merged by jenkins-bot:
[operations/deployment-charts@master] Add changeprop rules for newcomerTasksCacheRefreshJob

https://gerrit.wikimedia.org/r/636078

Change 635765 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: Schedule a cache refresh via job queue

https://gerrit.wikimedia.org/r/635765