Page MenuHomePhabricator

[betalabs] Suggested edits - the counter discrepancies
Closed, ResolvedPublicBUG REPORT

Description

(1) The discrepancy in the number of tasks if the tasks belong to several task types (the issue is also present in testwiki wmf.19):

  • Select Fashion as a topic
  • Go to Task type filter selection and select two task types - Copyedit + Add link (a structured task) - the count of tasks will be 8 (confirmed by Special:NewcomerTaskInfo).

Screen Shot 2022-07-08 at 2.13.19 PM.png (554×1 px, 94 KB)

The actual number of articles # of copyedit tasks=5 and # of add link tasks=2 (title: "Kabelka" and title: "Gigi Hadid" are in both Copyedit and Add link task types:

Array(5) [ {…}, {…}, {…}, {…}, {…} ]
​
0: Object { tasktype: "copyedit", difficulty: "easy", title: "Tetování", … }
​
1: Object { tasktype: "copyedit", difficulty: "easy", title: "Gigi Hadid", … }
​
2: Object { tasktype: "copyedit", difficulty: "easy", title: "Křížek (znak)", … }
​
3: Object { tasktype: "copyedit", difficulty: "easy", title: "Kabelka", … }
​
4: Object { tasktype: "copyedit", difficulty: "easy", title: "Denisa Domanská", … }
​
length: 5
​
Array [ {…}, {…} ]
​
0: Object { tasktype: "link-recommendation", difficulty: "easy", title: "Kabelka", … }
​
1: Object { tasktype: "link-recommendation", difficulty: "easy", title: "Gigi Hadid", … }
​
length: 2
  • Click Done to confirm the selection in the step above ( Fashion + Copyedit + Add link (a structured task))
  • the counter will display 5 tasks now to match the physical count of tasks.

(the gif illustrated the two steps above):

se_counter5.gif (678×1 px, 442 KB)

(2) Going through articles in the post-edit dialog will at some point display non-active navigational arrows and incorrect count of suggestions.

  • In the post-edit dialog click on the right arrow to see other suggested articles.
  • At some point (I see most cases after 19th suggestions) the message "End of 20 suggestions" will be displayed with with the back arrow button inactive, but the forward arrow button will be display as active.
Screen Shot 2022-07-08 at 1.24.20 PM.png (626×1 px, 197 KB)
Screen Shot 2022-07-08 at 1.22.24 PM.png (578×980 px, 78 KB)
  • Another case - both arrow buttons are inactive.

Screen Shot 2022-07-08 at 12.12.16 PM.png (1×1 px, 366 KB)

  • The number in the message - e.g. "End of 20 suggestions" or "End of 3 suggestions" will not match the initial number of suggestions.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Seems like the problematic code is in ApiQueryGrowthTasks:

// If we aborted because of $fits, $i is the 0-based index (relative to $offset) of which
// item we need to continue with in the next request, so we need to start with $offset + $i.
// If we finished (reached $limit) then $i points to the last task we successfully added.
if ( !$fits || $tasks->getTotalCount() > $offset + $i + 1 ) {
	// $i is 0-based and will point to the first record not added, so the offset must be one larger.
	$this->setContinueEnumParameter( 'offset', $offset + $i + (int)$fits );
}

The bug here is probably due to the addition of hasNext in rEGRE8183e0dc8787: Newcomer tasks: adjust the number of tasks shown in the taskfeed pager to match… which checks to see if continue is populated in the API result. @Tgr perhaps we can get rid of this code? I don't think we make use of 'offset' for querying in our client-side code. Or we could adjust the above code to not set the continue parameter when the number of tasks is lower than SearchTaskSuggester::DEFAULT_LIMIT?

Just removing this code will presumably break the frontend logic which expects the continue flag to convey some sort of information. The standard approach to this kind of thing is to fetch $limit + 1 tasks, and see if the +1 task was actually retrieved. Using $limit (same as SearchTaskSuggester::DEFAULT_LIMIT in practice but more flexible) works too - slightly less accurate (when there are exactly that many tasks) but simpler.

Just removing this code will presumably break the frontend logic which expects the continue flag to convey some sort of information. The standard approach to this kind of thing is to fetch $limit + 1 tasks, and see if the +1 task was actually retrieved. Using $limit (same as SearchTaskSuggester::DEFAULT_LIMIT in practice but more flexible) works too - slightly less accurate (when there are exactly that many tasks) but simpler.

I wonder if we should not rely on continue parameter at all given that we don't actually have the ability to page through results, due to our use of random sorting.

When we ask ElasticSearch for tasks, we get back a totalCount number that is then adjusted by 1) removing duplicates, 2) removing protected pages, 3) removing invalid link / image recommendations. We can keep track of the totalCount and subtract the pages that are duplicated, then on the client-side we can keep track of how many tasks we actually have in the task store. When the latter is equal to the adjusted totalCount, we know that there are no more tasks to fetch.

Sgs changed the task status from Open to In Progress.Jul 20 2022, 1:17 PM
Sgs claimed this task.

I wonder if we should not rely on continue parameter at all given that we don't actually have the ability to page through results, due to our use of random sorting.

In general I think there are three ways to handle this:

  1. Have the server indicate whether there are more results via some flag (whether that's continue or something else is an implementation detail).
  2. Make it the callers' responsibility to always request one more task than needed, and use that as the flag. This is the easiest in the short term (although I think CacheDecorator would have to be adjusted so it works well with caching), AIUI GrowthTasksApi.js does this now. In the long term, if we will use the API in various places, it would be nicer to centralize the logic on the server side.
  3. Accept that sometimes we end up in a situation where the frontend requests another batch of tasks and receives an empty array, and handle the UX implications of that (since we a "no more results" card, instead of just disabling the "next" button on the last card, which would be a common approach to paging, we can easily do this).

When we ask ElasticSearch for tasks, we get back a totalCount number that is then adjusted by 1) removing duplicates, 2) removing protected pages, 3) removing invalid link / image recommendations. We can keep track of the totalCount and subtract the pages that are duplicated, then on the client-side we can keep track of how many tasks we actually have in the task store. When the latter is equal to the adjusted totalCount, we know that there are no more tasks to fetch.

We cannot get an accurate total count while the results can still be paged (only on the last chunk), because deduplication and protection filtering cannot be done for the whole result set. So I don't see what value this would provide. If we just want to accurately know whether there is a next page, the easiest way is to request limit+1 tasks and see if the extra task is present. It's not entirely trivial because of how the limit is hardcoded in CacheDecorator, but still more straightforward IMO, and doesn't need to be updated whenever our filtering rules change.

Change 815913 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] Post edit dialog: update tasks fetched accumulator based on API response

https://gerrit.wikimedia.org/r/815913

Change 815923 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] GrowthTasksApi: avoid using continue from API response

https://gerrit.wikimedia.org/r/815923

Change 815913 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Post edit dialog: update tasks fetched accumulator based on API response

https://gerrit.wikimedia.org/r/815913

Change 820719 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: dont emit update event until count is set

https://gerrit.wikimedia.org/r/820719

Change 820719 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Newcomer tasks: dont emit update event until count is set

https://gerrit.wikimedia.org/r/820719

Change 815923 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] GrowthTasksApi: avoid using continue and increase look ahead tasks on fetch requests

https://gerrit.wikimedia.org/r/815923

Sgs changed the task status from In Progress to Open.Sep 12 2022, 5:48 PM
Sgs moved this task from Code Review to QA on the Growth-Team (Sprint 0 (Growth Team)) board.
Etonkovidova closed this task as Resolved.EditedOct 24 2022, 11:13 PM

Checked in testwiki wmf.6 - all works as expected.