Special page to show statistics about newcomer task pool sizes
Closed, ResolvedPublic

Description

As mentioned in T276795: Monitoring for GrowthExperiments link recommendation task pool and T249987: Scale: GrowthExperiments wiki monitoring dashboard, there is a desire for an easily accessible overview of available tasks, by type and by topic.

The special page is Special:NewcomerTasksInfo

Event Timeline

Change 675085 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Add Special page for showing newcomer task pool sizes

https://gerrit.wikimedia.org/r/675085

Change 675085 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add Special page for showing newcomer task pool sizes

https://gerrit.wikimedia.org/r/675085

kostajh moved this task from Code Review to QA on the Growth-Team (Current Sprint) board.
kostajh updated the task description.
kostajh moved this task from March 29 - April 2 to Done / QA on the Add-Link board.

Checked in betalabs and in testwiki wmf.37 - Special:NewcomerTasksInfo.

(1) The total count of fetched articles by the topic should not be just a sum of the numbers for each difficulty filter. It is rare that the number of suggested articles for a topic (e.g. "Art") equals Sum("Art" topic for the copyedit filter) + Sum("Art" topic for the links filter) + ..., because a single article often belongs to several filters (a one-to-many relationship).

Screen Shot 2021-04-05 at 6.42.15 PM.png (634×915 px, 111 KB)

(2) There is an off-by-one discrepancy. For some filters, Special:NewcomerTasksInfo lists a count that is one higher than the count displayed in the SE module:

Screen Shot 2021-04-05 at 6.29.00 PM.png (587×925 px, 112 KB)

The "Add links between articles" task has a count of 55. The SE module displays 54, but the count of 55 is briefly visible while the results are loading.
SE_count3.gif (600×647 px, 65 KB)

@MMiller_WMF, @RHo - should the total count information on Special:NewcomerTasksInfo be exactly the same as in the SE module?

The SE module contains logic to filter out protected articles, which Special:NewcomerTasksInfo doesn't. T259346: Add page protection filter to CirrusSearch is probably the best way to do this, rather than re-implementing the logic in every place where we want to query data about tasks. Maybe that explains the off-by-one discrepancy.

(1) The total count of fetched articles by the topic should not be just a sum of the numbers for each difficulty filter.

Not sure about that. If we have a copyedit task and a link task for the same page, aren't those two different tasks? We won't show them at the same time to the user, but if they solve one of them, they will still get the other.

In T278524#6974722, @Etonkovidova wrote:

(1) The total count of fetched articles by the topic should not be just a sum of the numbers for each difficulty filter.

@Tgr @Etonkovidova -- I think it would be best if there were two count columns, one that is the count of tasks and the other that is the count of distinct articles. If that's not possible, then I think it is fine to stay with count of tasks.
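
The difference between the two proposed columns can be illustrated with a minimal Python sketch. The sample data and the `count_by_topic` helper are hypothetical and are not taken from GrowthExperiments (which is written in PHP); the sketch only shows why the task count and the distinct-article count diverge when one article qualifies for several task types.

```python
from collections import defaultdict

# Hypothetical sample data: (article title, task type, topic).
SAMPLE_TASKS = [
    ("Mona Lisa", "copyedit", "art"),
    ("Mona Lisa", "links", "art"),
    ("Guernica", "links", "art"),
    ("Cubism", "copyedit", "art"),
    ("Cubism", "references", "art"),
]


def count_by_topic(tasks):
    """Return {topic: (task_count, distinct_article_count)}."""
    task_counts = defaultdict(int)
    articles = defaultdict(set)
    for title, _task_type, topic in tasks:
        task_counts[topic] += 1      # every (article, task type) pair is one task
        articles[topic].add(title)   # a set keeps each article only once
    return {
        topic: (task_counts[topic], len(articles[topic]))
        for topic in task_counts
    }


counts = count_by_topic(SAMPLE_TASKS)
print(counts["art"])  # (5, 3): five tasks spread over three distinct articles
```

With this data, summing the per-task-type counts gives 5, while only 3 distinct articles exist for the topic, which is the gap described in point (1).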

[...]
The "Add links between articles" task has a count of 55. The SE module displays 54, but the count of 55 is briefly visible while the results are loading.

SE_count3.gif (600×647 px, 65 KB)

@MMiller_WMF, @RHo - should the total count information on Special:NewcomerTasksInfo be exactly the same as in the SE module?

Yes, they should be the same. If one of the purposes of this dashboard is to monitor the volume of articles newcomers are getting for each task type, we should ideally be seeing what the newcomers are shown. @kostajh - in your comment quoted below, are you suggesting this discrepancy will be fixed by T259346, as it will apply the same filtering logic to both the SE module and Special:NewcomerTasksInfo?

The SE module contains logic to filter out protected articles, which Special:NewcomerTasksInfo doesn't. T259346: Add page protection filter to CirrusSearch is probably the best way to do this, rather than re-implementing the logic in every place where we want to query data about tasks. Maybe that explains the off-by-one discrepancy.

Yes, that's what I was suggesting, although now that I look at it again, I see that the filtering logic is already in place (it's contained in the CacheDecorator, which is called in the SuggestionsInfo class, which is what powers Special:NewcomerTasksInfo).

The discrepancy could still be caused by the fact that Special:NewcomerTasksInfo caches its output for an hour, while the client-side code for the suggested edits module does additional filtering of the result set. So it's possible that if a page's protection status changes after the output has been cached, we'd see a difference in numbers. If that's what is occurring in T278524#6974722, then I'd say we should just leave it as is; knowing that there are 53 tasks instead of 55 for up to 59 minutes before the cache is refreshed is not a big deal IMHO.
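
The caching explanation above can be sketched in Python. This is only an illustration under stated assumptions: `CachedTaskCount`, `count_unprotected` and the 55-page pool are hypothetical names and data, not the actual CacheDecorator or SuggestionsInfo code, and the one-hour TTL is the only value taken from the thread.

```python
CACHE_TTL_SECONDS = 3600  # the thread states the output is cached for an hour


class CachedTaskCount:
    """Caches a computed count and recomputes it only after the TTL expires."""

    def __init__(self, compute, ttl=CACHE_TTL_SECONDS):
        self._compute = compute
        self._ttl = ttl
        self._value = None
        self._stored_at = None

    def get(self, now):
        # Recompute on first use or once the cached value is older than the TTL.
        if self._stored_at is None or now - self._stored_at >= self._ttl:
            self._value = self._compute()
            self._stored_at = now
        return self._value


def count_unprotected(pages):
    """Count pages that are not protected (stand-in for the live filter)."""
    return sum(1 for page in pages if not page["protected"])


# Hypothetical task pool of 55 pages, none of them protected at first.
pages = [{"title": f"Page {i}", "protected": False} for i in range(55)]
cached = CachedTaskCount(lambda: count_unprotected(pages))

print(cached.get(now=0))         # 55: computed and cached
pages[0]["protected"] = True     # one page becomes protected after caching
print(cached.get(now=1800))      # 55: stale cached value, 30 minutes later
print(count_unprotected(pages))  # 54: what live filtering would report
print(cached.get(now=3600))      # 54: cache expired, value recomputed
```

Under these assumptions the cached count and the live count differ by one until the cache expires, which matches the kind of temporary mismatch described above.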

Change 679346 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] SpecialNewcomerTasksInfo: add missing text()

https://gerrit.wikimedia.org/r/679346

Change 679346 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] SpecialNewcomerTasksInfo: add missing text()

https://gerrit.wikimedia.org/r/679346