Maniphest T193807

[Timebox 12hr] Investigation: using ORES for page review prioritization
Closed, Resolved · Public

Assigned To

Authored By

	MMiller_WMF
	May 3 2018, 9:19 PM

Description

In preparation for T193782, we need to think through using ORES models for scoring pages for review via AfC and NPP, specifically in the New Pages Feed interface.

The two models that we'll want to use are:

draftquality: predicts if the page will need to be speedy deleted (spam, vandalism, attack, or OK).
wp10: predicts the likely quality level of the page (Stub, Start, C, B, GA, FA).

The following are relevant user stories that we should plan for, as of 2018-05-03:

As a reviewer, I need to be able to filter the New Pages Feed by the four categories in draftquality.
As a reviewer, I need to be able to filter the New Pages Feed by the six categories in wp10.
As a reviewer, I need a page's draftquality category and wp10 category to be displayed with its entry in the New Pages Feed list.
As a reviewer, I need all pages in all namespaces listed in the New Pages Feed to be filterable with those models (although the reviewer will only see pages from one namespace at a time).
As a reviewer, I need models to be up-to-date with the latest revision of a page at all times.

We have decided that we no longer have the following user story, because inside a category, these scores would not be as useful as sorting by date:

As a reviewer, I need to be able to sort the New Pages Feed by the scores underlying the categories of draftquality and wp10.

Some technical considerations that have been brought up so far in discussion of these user stories (though there are likely many more):

Will it be difficult to re-score these models with every new revision of each page?
What will happen when we score these models on the User namespace, which is currently accessible in the New Pages Feed?

Deliverables

Create list of Phab tickets for a rough implementation plan
Identify & document any dependencies and risks
Answer technical consideration questions above.

Related Objects

Status	Assigned	Task
Resolved	MMiller_WMF	T193782 [EPIC] Prioritization tools for AfC and NPP
Resolved	MMiller_WMF	T196178 New Pages Feed: ORES addition
Resolved	MusikAnimal	T193807 [Timebox 12hr] Investigation: using ORES for page review prioritization

Event Timeline

MMiller_WMF created this task. May 3 2018, 9:19 PM

Restricted Application added a subscriber: Aklapper. May 3 2018, 9:19 PM

MMiller_WMF updated the task description. May 3 2018, 9:23 PM

Niharika moved this task from New & TBD Tickets to Needs Discussion on the Community-Tech board. May 8 2018, 9:24 PM

• TBolliger updated the task description. May 8 2018, 10:59 PM

• TBolliger renamed this task from Investigation: using ORES for page review prioritization to [Timebox 12hr] Investigation: using ORES for page review prioritization. May 8 2018, 11:31 PM

• TBolliger updated the task description.

• TBolliger edited projects, added Community-Tech-Sprint, English-Wikipedia-New-Pages-Patrol; removed Community-Tech. May 8 2018, 11:35 PM

MusikAnimal claimed this task. May 10 2018, 7:22 PM

MusikAnimal moved this task from Ready to In Development on the Community-Tech-Sprint board.

We first need new tags for draftquality and wp10. This would be as simple as creating an update script to insert the new rows into pagetriage_tags. Then we store the actual ORES values in the pagetriage_page_tags table. The ORES API is easy to work with, and I don't think there are any additional dependencies in order to do this.
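As a rough illustration of the "use the ORES API ourselves" step, here is a minimal Python sketch that builds a scores request for both models and pulls the predicted class and its probability out of a response. The endpoint path and the response layout are assumptions based on the ORES v3 scores API, the function names are hypothetical, and writing the values into pagetriage_page_tags is left out.

```python
from urllib.parse import urlencode

# Assumed base URL of the ORES v3 scores endpoint.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_scores_url(wiki, rev_ids, models=("draftquality", "wp10")):
    """Build a request URL that scores several revisions with several models."""
    query = urlencode({
        "models": "|".join(models),
        "revids": "|".join(str(r) for r in rev_ids),
    })
    return f"{ORES_BASE}/{wiki}/?{query}"


def extract_predictions(response, wiki, rev_id):
    """Return {model: (predicted_class, probability)} for one revision.

    Assumes the response is shaped like
    {wiki: {"scores": {rev_id: {model: {"score": {...}}}}}}.
    Models that returned an error instead of a score are skipped.
    """
    scores = response[wiki]["scores"][str(rev_id)]
    result = {}
    for model, payload in scores.items():
        score = payload.get("score")
        if score is None:
            continue
        prediction = score["prediction"]
        result[model] = (prediction, score["probability"][prediction])
    return result
```

The pair returned for each model is what would be stored against the page, one row per tag.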

We can use the onNewRevisionFromEditComplete hook handler to update the ORES data on each edit. Currently we're only using this hook to detect articles created from a redirect, which is somewhat cheap to do. Running the revision through ORES is probably a bit more expensive:

Will it be difficult to re-score these models with every new revision of each page?

I think we need ORES people like @Halfak or someone from performance to tell us whether checking each and every revision, as it is made, is a good idea. Each query can take several seconds to complete, so I have my doubts we can feasibly do this at a high rate. Another option, I suppose, is to throttle the ORES lookups, so that we only do one lookup per article within, say, a 10-minute window, rather than one for each and every edit. Or, we could use a cronjob that runs every 10 minutes and updates ORES scores for any pages that were edited in the past 10 minutes (that is, since the last time the cron ran). The latter sounds easier to me.
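To make the two alternatives concrete, here is a small Python sketch of both: a per-page throttle and the selection step of a periodic batch job. The class, function, and parameter names are hypothetical, the 10-minute window is the figure suggested above, and the in-memory dictionary stands in for whatever storage would really be used.

```python
import time

THROTTLE_SECONDS = 10 * 60  # the 10-minute window suggested above


class ScoreThrottle:
    """Allow at most one ORES lookup per page within the window."""

    def __init__(self, window=THROTTLE_SECONDS, clock=time.monotonic):
        self._window = window
        self._clock = clock
        self._last_scored = {}  # page_id -> time of the last allowed lookup

    def should_score(self, page_id):
        now = self._clock()
        last = self._last_scored.get(page_id)
        if last is not None and now - last < self._window:
            return False  # scored too recently, skip this edit
        self._last_scored[page_id] = now
        return True


def pages_to_rescore(edits, since):
    """Cron variant: map each page edited at or after `since` to its newest revision.

    `edits` is an iterable of (page_id, rev_id, timestamp) tuples.
    """
    latest = {}
    for page_id, rev_id, ts in edits:
        if ts >= since and (page_id not in latest or ts > latest[page_id][1]):
            latest[page_id] = (rev_id, ts)
    return {page_id: rev_id for page_id, (rev_id, _) in latest.items()}
```

One difference worth noting: with the throttle alone, an edit that lands inside the window is never scored unless a later edit arrives, so the stored score can lag the latest revision. The batch variant always scores the newest revision of each page it picks up.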

I need all pages listed in the New Pages Feed to be filterable with those models, regardless of namespace.

Currently we can only show one namespace at a time, but I don't think it will be a problem to add the option "All" to the namespace dropdown. Hopefully it will be clear this only includes Article, Draft and User namespaces. The filtering otherwise should be easy to do, with two new dropdowns for "Draft quality" and "Assessment".

I need a page's draftquality category and wp10 category to be displayed with its entry in the New Pages Feed list.

Technically it's no problem. For draft quality, I'd suggest only showing the overall prediction and its score, so "OK (65%)" or "attack (95%)". Similarly for wp10, with "Stub (85%)" or "FA (60%)" (as if there were ever any newly created FAs :). This would keep it concise, and should be able to fit nicely within the "55 bytes · 1 edit · 1 category" region.
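The label format suggested here could be produced along these lines. This is only a sketch: the function names are hypothetical, and it assumes the per-class probabilities are available as a plain mapping from class name to a value between 0 and 1.

```python
def top_prediction(probabilities):
    """Return (label, probability) for the most likely class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


def format_label(prediction, probability):
    """Render a prediction as e.g. 'OK (65%)', rounded to a whole percent."""
    return f"{prediction} ({probability:.0%})"


draftquality = {"OK": 0.65, "spam": 0.20, "vandalism": 0.10, "attack": 0.05}
print(format_label(*top_prediction(draftquality)))
```

Rounding to a whole percent keeps the label short enough to sit alongside the existing byte, edit, and category counts.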

I need to be able to sort the New Pages Feed by the scores underlying the categories of draftquality and wp10.

I don't see why we couldn't do this, technically. We'd add two new options to the "Sort by:" list at the top-right, "Draft quality" and "WP10", each with a dropdown.

What will happen when we score these models on the User namespace, which is currently accessible in the New Pages Feed?

I probably wouldn't do this. Yes, there are drafts created in the userspace, but there are probably just as many "Hi my name is Bob" user pages. I did some quick testing, and it seems like ORES does a good job of detecting that these are at least good-faith, but the scoring in general probably isn't that helpful for the purposes of reviewing. Maybe we could do it only for submitted drafts in the userspace (which, by the way, are often moved by the reviewer to Draft)? I just want to weigh the benefit of scoring the userspace against the performance impact.

Aside from the above, I think the remaining user stories can be fulfilled. I will wait for further feedback before creating any new tickets.

@MusikAnimal -- thanks for investigating so far. Though some of what you said is beyond my technical depth, I have some responses and clarifications:

For the story that reads "I need all pages listed in the New Pages Feed to be filterable with those models, regardless of namespace." -- I actually just meant that the feed should be filterable by those models no matter which namespace you've selected (i.e. we score all the relevant namespaces with the models). We don't ever want all the namespaces to be in the same list at the same time, because that would make it easy for a reviewer to accidentally apply the wrong criteria to a page they're reviewing. I'll clarify this in the description.
I like the idea of listing the score in parentheses next to the category (@alexhollender).
We've since decided we don't actually have this story anymore, and I'll clarify in the description: "I need to be able to sort the New Pages Feed by the scores underlying the categories of draftquality and wp10."

MMiller_WMF updated the task description. May 10 2018, 10:03 PM

In T193807#4198661, @MusikAnimal wrote:

whether checking each and every revision, as they are made, is a good idea.

It would only be every revision to a page that is in the AFC pending list though wouldn't it? (Which is defined by category membership.) So it'd slow down saving of those pages, but maybe that's acceptable? Or if that's no good, it could only calculate the scores when being added to the list.

In T193807#4198930, @Samwilson wrote:

In T193807#4198661, @MusikAnimal wrote:

whether checking each and every revision, as they are made, is a good idea.

It would only be every revision to a page that is in the AFC pending list though wouldn't it? (Which is defined by category membership.) So it'd slow down saving of those pages, but maybe that's acceptable? Or if that's no good, it could only calculate the scores when being added to the list.

I just assumed we were doing it for all pages in Special:NewPagesFeed, or at least that would be useful if we did. I think the updates to tags are deferred, ~~so shouldn't noticeably slow anything down~~ (disregard, not how deferred updates work, apparently!)

MusikAnimal mentioned this in T193809: [Timebox 12hr] Investigation: applying copyvio for page review prioritization. May 15 2018, 2:25 AM

Just to wrap up this investigation:

There are two main options to do this:
- Use the ORES API ourselves, and store the scores in the pagetriage_page_tags table in the page triage database.
- Use the table in the MediaWiki database that the Scoring team is already storing scores in. This relies on the Scoring team's timelines to get the draftquality and wp10 models into that table.
We prefer the latter option, provided that it is ready on the timelines that this project will work on.

To continue to follow along with this element of the project, see T195796.

MMiller_WMF closed this task as Resolved. Jun 1 2018, 5:07 PM

MMiller_WMF edited parent tasks, added: T196178: New Pages Feed: ORES addition; removed: T193782: [EPIC] Prioritization tools for AfC and NPP. Jun 1 2018, 5:42 PM

• TBolliger moved this task from Needs Review/Feedback to Q1 2018-19 on the Community-Tech-Sprint board. Jun 4 2018, 8:25 PM

• Vvjjkkii renamed this task from [Timebox 12hr] Investigation: using ORES for page review prioritization to sndaaaaaaa. Jul 1 2018, 1:12 AM

• Vvjjkkii reopened this task as Open.

• Vvjjkkii removed MusikAnimal as the assignee of this task.

• Vvjjkkii triaged this task as High priority.