We use a search query to generate a set of tasks, cache it for up to a week, and revalidate it when the user next asks for it. Currently, revalidation means checking whether the templates that were used to define those tasks (such as Template:Copyedit for copyediting tasks) are still present. In the future, as the search queries get more complex, we might need to revalidate many more things (whether the page is protected, whether it is in a bad category such as articles nominated for deletion, whether it has recommendations, etc.). Each of these requires complex backend logic. It would be more sustainable if we could revalidate by re-running the same search query and checking whether the tasks still match it. This is doable if we create a search keyword that restricts the search to the set of pages in that task set.
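A minimal sketch of that idea, in Python for illustration only (the real code would live in the PHP extensions). It assumes the new keyword is spelled `pageid:` and accepts pipe-separated page IDs; the task does not state the syntax, so treat that as a guess. `build_revalidation_query`, `revalidate` and the injected `search` callable are hypothetical names.

```python
from typing import Callable, Iterable, List, Set

# Assumption: the keyword is "pageid" and takes pipe-separated IDs.
PAGE_ID_KEYWORD = "pageid"


def build_revalidation_query(base_query: str, page_ids: Iterable[int]) -> str:
    """Append a page ID restriction to the query that produced the task set."""
    id_list = "|".join(str(page_id) for page_id in sorted(set(page_ids)))
    return f"{base_query} {PAGE_ID_KEYWORD}:{id_list}"


def revalidate(
    cached_page_ids: List[int],
    base_query: str,
    search: Callable[[str], Set[int]],
) -> List[int]:
    """Return the cached page IDs that still match the original query.

    `search` is a hypothetical stand-in for the search backend: it takes a
    query string and returns the set of matching page IDs.
    """
    if not cached_page_ids:
        return []
    still_matching = search(build_revalidation_query(base_query, cached_page_ids))
    # Keep the cached order and drop pages that no longer match.
    return [page_id for page_id in cached_page_ids if page_id in still_matching]
```

With this shape, any new condition added to the search query (protection, categories, and so on) is revalidated without extra backend logic, because the same query is simply run again against the cached pages.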
Event Timeline
Change 646896 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/CirrusSearch@master] Add PageIdFeature
Change 645788 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Replace TemplateFilter with TaskSuggester::filter
Change 646896 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Add PageIdFeature
Change 655376 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Make TaskSuggester::suggest() options easier to expand
Change 655377 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Skip topics when revalidating
Change 655376 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Make TaskSuggester::suggest() options easier to expand
Change 645788 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Replace TemplateFilter with TaskSuggester::filter
Change 655377 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Skip topics when revalidating
I tested this on test.wikipedia.org by:
- Going to Special:Homepage and limiting my filters to something that returned a manageable number of tasks (I chose Architecture + References, which yielded 13 tasks).
- Picking one of the tasks and removing its {{Citation Needed}} templates.
- Going back to the homepage, where the server-side render of the page shows 13 tasks (since we fetch from the cache without any filtering), while the subsequent call to the API returns 12 tasks (filtering applied).
Performance appears to be fine with this solution. The only downside is that there is a rare possibility that the first task (and only the first one) might have had its maintenance templates removed after the user's task set was cached. I think we can live with this.
Currently we set TTL_UNCACHEABLE, but we could also set the TTL based on whether revalidation resulted in removing any tasks, so that the server-side rendered results would be incorrect only the first time.
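One possible reading of that, sketched in Python for illustration: write the revalidated set back to the cache only when revalidation removed something, and otherwise leave the cached entry alone. The constant values, the one-week TTL for the rewritten entry, and the function name are all assumptions, not the extension's actual behavior.

```python
from typing import List

# Illustrative values; only the name TTL_UNCACHEABLE comes from the discussion.
TTL_UNCACHEABLE = -1
TTL_WEEK = 7 * 24 * 3600


def ttl_after_revalidation(cached: List[int], revalidated: List[int]) -> int:
    """Pick the TTL for storing the revalidated task set.

    If revalidation dropped at least one task, store the filtered set so the
    next server-side render no longer shows the stale task. If nothing
    changed, do not rewrite the cache entry.
    """
    removed = set(cached) - set(revalidated)
    return TTL_WEEK if removed else TTL_UNCACHEABLE
```

Whether the rewritten entry should get a fresh week or only the time remaining on the original entry is left open here.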
For now, the potential task change after JS has loaded is behavior that exists regardless of this task, because protection filtering happens outside of search. Once that's fixed and there is no need to provide a buffer (i.e. fetching 250 results when we are really just looking for 200), we could change SearchTaskSuggester to stop making queries as soon as it has hit the limit. A search with a limit of 1 would then usually take a single query, at which point IMO we could revisit this.
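A rough sketch of the early-stop behavior, in Python for illustration. It assumes the suggester works through a sequence of separate search queries; `collect_tasks` and the per-query callables are hypothetical and do not reflect how SearchTaskSuggester is actually structured.

```python
from typing import Callable, List, Sequence, Tuple


def collect_tasks(
    queries: Sequence[Callable[[int], List[int]]],
    limit: int,
) -> Tuple[List[int], int]:
    """Run queries in order and stop as soon as `limit` tasks are collected.

    Each callable takes the number of results still needed and returns page
    IDs. Returns the collected page IDs and the number of queries executed.
    """
    results: List[int] = []
    executed = 0
    for run_query in queries:
        if len(results) >= limit:
            break  # limit reached: skip the remaining queries
        executed += 1
        for page_id in run_query(limit - len(results)):
            if page_id not in results:
                results.append(page_id)
    return results[:limit], executed
```

In this sketch, a limit of 1 runs only the first query as long as that query returns at least one result.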