Newcomer tasks: rules for duplicate results
Closed, Resolved (Public)

Description

Now that we're using topics to serve tasks in the suggested edits module, and allowing users to select multiple topics and multiple task types, we show an article once for every selected topic it matches. In other words, an article may be shown twice if it matches both the "sports" and "history" topics (and the user has both those topics selected). This means that the user can feel like they are getting duplicates in the feed, especially since there is no visual difference between the repeated appearances of an article that fits multiple topics.

This should not have been happening for articles that have multiple maintenance templates. That's because in T232423: Newcomer tasks: suggested edits module, we wrote these specifications:

  • The following list is the mapping of task types to difficulty levels.
    • Links = Easy
    • Copy edit = Easy
    • Update = Medium
    • References = Medium
    • Expand = Hard
  • If an article has multiple maintenance categories/templates, it should only be counted once in the list shown to the user. The list from the bullet above should also be used as the order of preference for how the article is counted. For instance:
    • If an article has both the "Links" and "Copy edit" templates, and the user has selected the "Links" and "Copy edit" checkboxes, the user should see the article in the list once, as a "Links" article. But if they have only selected the "Copy edit" checkbox, the user should see the article in the list once, as a "Copy edit" article.
    • If an article has both the "Copy edit" and "Update" templates, and the user has selected the "Copy edit" and "Update" checkboxes, the user should see the article in the list once, as a "Copy edit" article. But if they have only selected the "Links" and "Update" checkboxes, the user should see the article in the list once, as an "Update" article.
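
The precedence rule in the specification above can be sketched as follows. This is only an illustration in Python, not the GrowthExperiments implementation: the function name, the lowercase task type identifiers, and the shape of the input (a mapping from article title to the set of task types its templates indicate) are all assumptions made for the example.

```python
# Preference order taken from the difficulty list in the specification
# (Links, Copy edit, Update, References, Expand). The lowercase identifiers
# are hypothetical names chosen for this sketch.
TASK_TYPE_PREFERENCE = ["links", "copyedit", "update", "references", "expand"]


def assign_task_types(article_templates, selected_types):
    """Return {title: task_type}, listing each article at most once.

    article_templates: dict mapping an article title to the set of task
        types whose maintenance templates the article carries (assumed shape).
    selected_types: set of task types the user has checked.
    """
    assigned = {}
    for title, templates in article_templates.items():
        for task_type in TASK_TYPE_PREFERENCE:
            if task_type in selected_types and task_type in templates:
                assigned[title] = task_type
                break  # the first match in preference order wins
    return assigned


articles = {
    "Article A": {"links", "copyedit"},
    "Article B": {"copyedit", "update"},
}
# With "links", "copyedit" and "update" all selected, Article A is counted
# as "links" and Article B as "copyedit".
print(assign_task_types(articles, {"links", "copyedit", "update"}))
```

Articles that carry none of the selected task types are simply left out of the result, which matches the idea that they would not be in the feed at all.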

The ideal business rule here would be to show each article only once, no matter how many topics it fits.

Event Timeline

Tgr added a comment. Jan 22 2020, 1:22 AM

To avoid maintaining an extra configuration setting, I'll use the filter dialog order for precedence (i.e. we choose copyedit over link, link over update and so on). It's easy to change if maintaining a different precedence order is preferred.
For topics it will be similar: topics that come first in the JSON file and the filter dialog are preferred when a task is in two topics. (Having to deduplicate topics should be a temporary issue in any case.)
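
A rough sketch of this kind of order-based deduplication, written in Python for illustration only (the actual patch is in the GrowthExperiments extension and may work differently). The result shape, the function name, and the choice to compare task type before topic are assumptions; the order lists stand in for the filter dialog order and the topic JSON order.

```python
def deduplicate(results, task_type_order, topic_order):
    """Keep one entry per article title.

    results: iterable of (title, task_type, topic) tuples (assumed shape).
    task_type_order, topic_order: lists where earlier items are preferred.
    Entries are compared by task type first, then by topic; that tie-breaking
    order is an assumption of this sketch.
    """
    def rank(entry):
        _, task_type, topic = entry
        return (task_type_order.index(task_type), topic_order.index(topic))

    best = {}
    for entry in results:
        title = entry[0]
        # Keep the entry with the lowest rank; the dict keeps each title at
        # the position where it first appeared in the results.
        if title not in best or rank(entry) < rank(best[title]):
            best[title] = entry
    return list(best.values())


results = [
    ("Foo", "links", "history"),
    ("Foo", "copyedit", "history"),
    ("Foo", "copyedit", "sports"),
    ("Bar", "update", "history"),
]
deduped = deduplicate(
    results,
    task_type_order=["copyedit", "links", "update"],  # order named in the comment
    topic_order=["sports", "history"],                # hypothetical topic order
)
print(deduped)
```

Because precedence comes from the two order lists, changing the preferred order only means reordering those lists, which is the property the comment above relies on.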

Change 566398 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Suggested edits: Deduplicate search results

https://gerrit.wikimedia.org/r/566398

Tgr claimed this task. Jan 22 2020, 1:55 AM

Change 566398 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Suggested edits: Deduplicate search results

https://gerrit.wikimedia.org/r/566398

Etonkovidova closed this task as Resolved. Jan 23 2020, 11:37 PM

Re-checked T242814 as part of testing this task; some discussion may be needed there.
Closing this task as Resolved.