[Timebox 12hr] Investigation: detecting which drafts are submitted and awaiting review by AfC
Closed, Resolved, Public

Description

In preparation for T193782, we need to think through what it will take to distinguish "submitted drafts awaiting review" from all drafts so that they can be displayed as such in the New Pages Feed interface.

The background is that new users create pages in the draft namespace, but then go through a separate step to "submit" them for review by Articles for Creation (AfC). This step applies a template and category to the draft. Those submitted drafts await review by an AfC reviewer, who can accept them and move them to the main namespace, or can decline them and send them back to the author for improvement and resubmission.

One additional element is that we want to prevent multiple AfC reviewers from reviewing the same draft at the same time. The gadget that most AfC reviewers use, the AFCH script, allows users to mark drafts as "under review", which could be used to prevent other reviewers from selecting that draft.

Therefore, there are two main user stories here:

  • As a reviewer, I need to be able to filter to only those drafts that have been submitted to AfC and are awaiting review. This would include drafts that are awaiting their second, third, etc. review, but it would exclude drafts that have been submitted for review, have already been reviewed, and are awaiting resubmission by their authors.
  • As a reviewer, I need to not accidentally attempt to review a draft already under review by another reviewer.

Some technical considerations that have been brought up so far in discussion of these user stories (though there are likely many more):

  • What work will it take to identify which drafts have the status of "submitted and awaiting review"?
  • Will we be able to exclude drafts under review from being present in the New Pages Feed list? Review typically takes only a couple minutes, so this type of exclusion would essentially need to be realtime.

Deliverables
  • Create list of Phab tickets for a rough implementation plan
  • Identify & document any dependencies and risks
  • Answer technical consideration questions above.

Investigation:

The {{AFC submission}} template is put on drafts (in any namespace but usually Draft), and this puts the pages into Category:Pending AfC submissions.

They are also put into other categories (these do not need to be examined, as they contain no pages that are not also in the above category):

  • Pending AfC submissions in article space
  • Pending AfC submissions in userspace
  • Pending template and disambiguation AfC submissions

Some pages will also be in Category:Pending AfC submissions being reviewed now, and these pages should *not* be added to the queue.

We currently consider a page for the page triage queue when it is created (PageContentInsertComplete), moved (SpecialMovepageAfterMove), or edited (NewRevisionFromEditComplete), and add it if it's in one of the permitted namespaces (and passes some other checks). The namespaces are defined by $wgPageTriageNamespaces and are currently main (0) and User (2); we'd add Draft (118). There's no need to worry about NS IDs outside English Wikipedia.
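
To make the namespace gate concrete, here is a minimal sketch. It is written in Python for brevity rather than in the extension's PHP, so it is only a model of the logic: `PAGE_TRIAGE_NAMESPACES` stands in for $wgPageTriageNamespaces, `should_enqueue` is a hypothetical name, and the "other checks" are not modelled at all.

```python
# Hypothetical stand-in for $wgPageTriageNamespaces, with Draft (118) added.
NS_MAIN = 0
NS_USER = 2
NS_DRAFT = 118  # Draft namespace ID on English Wikipedia

PAGE_TRIAGE_NAMESPACES = {NS_MAIN, NS_USER, NS_DRAFT}


def should_enqueue(namespace_id, permitted=PAGE_TRIAGE_NAMESPACES):
    """Return True if a created, moved, or edited page is in a permitted namespace.

    Only the namespace test is modelled; the real extension applies
    further checks that are not reproduced here.
    """
    return namespace_id in permitted
```

With the current configuration (main and User only), `should_enqueue(118, {0, 2})` is false, which is why drafts do not reach the queue today.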

This discussion has been about adding the Draft NS, but perhaps what's actually needed is a 'Pending AfC submissions' filter? That is, add a new 'Submitted pages' option to the list of 'Show' checkboxes in the filter. (As well as an extra filter for NS, but that is not as crucial and could even be left out.)

How about this? We add two new config variables, $wgPageTriageSubmissionsCat and $wgPageTriageSubmissionsInProgressCat, and hook on CategoryAfterPageAdded and CategoryAfterPageRemoved to add or remove pages from the triage queue when they're in the appropriate combination of categories. This would mean that pages wouldn't appear in NewPagesFeed while they're under review (although is it meant to be able to do that live? That would be a separate thing).
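
A rough model of that category rule, again in Python rather than PHP. The two constants are assumed values for the proposed config variables, and `TriageQueue`, `belongs_in_queue`, and the `on_category_*` methods are hypothetical names that only imitate what handlers for CategoryAfterPageAdded and CategoryAfterPageRemoved might do; the real hook signatures are not reproduced here.

```python
# Assumed values for the two proposed config variables.
SUBMISSIONS_CAT = "Pending AfC submissions"
SUBMISSIONS_IN_PROGRESS_CAT = "Pending AfC submissions being reviewed now"


def belongs_in_queue(categories):
    """A page is queued only if it is pending and not currently under review."""
    return (SUBMISSIONS_CAT in categories
            and SUBMISSIONS_IN_PROGRESS_CAT not in categories)


class TriageQueue:
    """In-memory stand-in for the triage queue, keyed by page ID."""

    def __init__(self):
        self.page_ids = set()
        self._categories = {}  # page_id -> set of category names

    def on_category_added(self, page_id, category):
        self._categories.setdefault(page_id, set()).add(category)
        self._sync(page_id)

    def on_category_removed(self, page_id, category):
        self._categories.setdefault(page_id, set()).discard(category)
        self._sync(page_id)

    def _sync(self, page_id):
        # Re-evaluate membership after every category change.
        if belongs_in_queue(self._categories[page_id]):
            self.page_ids.add(page_id)
        else:
            self.page_ids.discard(page_id)
```

Re-evaluating the full category set on each change, rather than reacting to one category at a time, keeps the result independent of the order in which the two categories are added or removed.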

As for storing the status of the page, pages are tagged with some number of (pre-defined, boolean) tags. We'd add a new tag afc_status which would be toggled as required, and with which we could query the relevant pages.
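
As an illustration only, toggling and querying such a tag could be modelled like this. `PageTags` and its methods are hypothetical and hold everything in memory; the extension's actual tag storage is a database table and is not represented here.

```python
from collections import defaultdict


class PageTags:
    """Toy model of per-page boolean tags such as the proposed afc_status."""

    def __init__(self):
        self._tags = defaultdict(dict)  # page_id -> {tag_name: bool}

    def set_tag(self, page_id, tag, value):
        """Turn a tag on or off for one page."""
        self._tags[page_id][tag] = bool(value)

    def pages_with(self, tag):
        """Return the sorted IDs of pages on which the tag is currently set."""
        return sorted(
            page_id
            for page_id, tags in self._tags.items()
            if tags.get(tag, False)
        )
```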

There's certainly stuff that I'm not familiar with in the whole process, so please tell me where I've gone wrong in the above!

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 4 2018, 1:13 AM
MMiller_WMF removed MMiller_WMF as the assignee of this task. · May 4 2018, 1:14 AM
TBolliger updated the task description. · May 8 2018, 10:58 PM
TBolliger renamed this task from Investigation: detecting which drafts are submitted and awaiting review by AfC to [Timebox 12hr] Investigation: detecting which drafts are submitted and awaiting review by AfC. · May 8 2018, 11:31 PM
TBolliger updated the task description.
Samwilson claimed this task. · May 9 2018, 2:43 AM
Samwilson moved this task from Ready to In Development on the Community-Tech-Sprint board.
Samwilson updated the task description. · May 9 2018, 6:54 AM

@Samwilson @MusikAnimal -- something I was thinking about that I wanted to bring up is about sorting by date. Right now, New Pages Feed allows sorting by "Newest" and "Oldest", which I believe is the date of creation of the page.

For AfC review, the reviewer's preference would probably be to sort by submission date when reviewing submitted drafts. For instance, a draft could be created in January, submitted in February, reviewed and declined in March, and then submitted again in April. The reviewer would probably consider that draft to be "April", not "January".

What are your thoughts on the possibilities here?

@MMiller_WMF That'll be hard! Figuring out when a category is added or removed from an article is a difficult problem. The databases or APIs aren't any help there.

Figuring out when a category is added or removed from an article is a difficult problem

We have categorylinks.cl_timestamp for that.

Ooh, I didn't know of that. That should give us the latest timestamp for when a category was added, I believe? That's cool.
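
For what it's worth, here is a small sketch of how sorting by that timestamp might look. It uses an in-memory SQLite table with only three of the categorylinks columns (cl_from, cl_to, cl_timestamp) and made-up sample rows, so the real schema, timestamp format, and indexes will differ; `demo_connection` and `pending_drafts_oldest_first` are hypothetical names.

```python
import sqlite3


def demo_connection():
    """Build a simplified, in-memory stand-in for the categorylinks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE categorylinks ("
        "cl_from INTEGER, cl_to TEXT, cl_timestamp TEXT)"
    )
    # Illustrative rows only: (page ID, category, time the link was recorded).
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?, ?)",
        [
            (10, "Pending_AfC_submissions", "2018-04-02 09:00:00"),
            (11, "Pending_AfC_submissions", "2018-02-15 12:30:00"),
            (12, "Some_other_category", "2018-01-01 00:00:00"),
        ],
    )
    return conn


def pending_drafts_oldest_first(conn, category="Pending_AfC_submissions"):
    """Return (page_id, timestamp) pairs for the category, oldest link first."""
    cursor = conn.execute(
        "SELECT cl_from, cl_timestamp FROM categorylinks "
        "WHERE cl_to = ? ORDER BY cl_timestamp ASC",
        (category,),
    )
    return cursor.fetchall()
```

In this toy data, page 11 sorts ahead of page 10 because its category link is older, regardless of when either page was created.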

I think we still need to duplicate it in pagetriage_page_tags, maybe? It seems this is the mechanism that allows you to very quickly sort and scroll through the feed. I wonder what kind of performance impact there will be with all those extra rows from these new features.

MMiller_WMF closed this task as Resolved. · Jun 1 2018, 5:09 PM

This initial investigation is complete and the work is now happening on T195545 and T195924. Please follow along there.