New Pages Feed: filtering on ORES scores (3.2)
Closed, Resolved | Public

Description

The work in this task and in T195796 makes up the third useful feature change that we could roll out to users. The work in this task is part of accomplishing these user stories:

  • As a reviewer, I need to be able to filter by the four categories in the ORES draftquality model (vandalism, spam, attack, ok).
  • As a reviewer, I need to be able to filter by the six categories in the ORES wp10 model (Stub, Start, C-class, B-class, Good, Featured).

Specifically, the work is to build on T195796 by allowing users of the New Pages Feed to filter pages based on ORES scores:

  • Filters will need to be added to the New Pages Feed to allow selection on these scores. The same filters will need to be added to the filter menu for both the NPP and AfC use cases. There is a conversation with the reviewing community on whether it would be better to allow reviewers to filter the New Pages Feed using the specific categories produced by the two models (like in Concept A) or to roll up those categories into less granular shortcuts (like in Concept B). We have decided to implement Concept A. In this situation, the user interface should refer to the draftquality model as "Predicted issues" and wp10 as "Predicted class". The values should be shown in the interface as two sets of checkboxes, in these orderings:
      • Predicted issues
        • None (note that this corresponds to the model value "OK")
        • Spam
        • Vandalism
        • Attack
      • Predicted class
        • Stub
        • Start
        • C-class
        • B-class
        • Good
        • Featured
    • When adding these filters to the menu for NPP, all existing filter options in that menu should remain unchanged ("Show", "In namespace", and "That" -- see wireframes below for more clarity).
    • When model categories are selected, the selected categories should be displayed in the list's header, where the header right now says things like "Showing: unreviewed, blocked users". We want to use @alexhollender's recommended design, which takes into account the various possibilities for ANDs and ORs by doing it like this: "Active filters: State (Awaiting review), Quality (B-class, C-class), Classification (attack), Copyvio (below 50)". See the attached mockup.
    • Although the ORES API by default returns a specific category (e.g. "spam") in addition to numeric scores for spam and the other categories, it simply picks the category with the highest score as the winner. The Scoring team recommends that we build in the ability to adjust the score cutoff for our chosen category, which will allow us to tune the usage of the models according to reviewer preferences. The Collaboration team gave themselves this ability to adjust cutoffs with the Recent Changes feed (note: this is about our ability to adjust cutoffs on the software side, not about giving reviewers the ability to set their own cutoffs). Here is what we have decided:
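
To illustrate the cutoff idea in the bullet above (this is not the decided configuration), here is a minimal Python sketch contrasting "highest score wins" with per-category cutoffs. The dictionary of probabilities, the cutoff values, and both function names are assumptions made for this example; they are not the ORES response format or the PageTriage implementation.

```python
from typing import Dict

# Hypothetical cutoffs for illustration; real values would be tuned with reviewers.
DEFAULT_CUTOFFS: Dict[str, float] = {"spam": 0.6, "vandalism": 0.5, "attack": 0.7}


def argmax_category(probabilities: Dict[str, float]) -> str:
    """Default behaviour described above: the highest-scoring category wins."""
    return max(probabilities, key=lambda category: probabilities[category])


def category_with_cutoffs(
    probabilities: Dict[str, float],
    cutoffs: Dict[str, float] = DEFAULT_CUTOFFS,
    fallback: str = "OK",
) -> str:
    """Return the highest-scoring issue category that clears its cutoff, else the fallback."""
    flagged = {
        category: score
        for category, score in probabilities.items()
        if category in cutoffs and score >= cutoffs[category]
    }
    if not flagged:
        return fallback
    return max(flagged, key=lambda category: flagged[category])


# Made-up scores for a single page.
scores = {"OK": 0.45, "spam": 0.40, "vandalism": 0.10, "attack": 0.05}
lenient = {"spam": 0.35, "vandalism": 0.5, "attack": 0.5}

print(argmax_category(scores))                 # OK
print(category_with_cutoffs(scores, lenient))  # spam
```

With a lower spam cutoff, the same page moves from "OK" to "spam" without any change to the model, which is the kind of software-side tuning the bullet describes.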

Two other notes:

  • User:SQL made a page that scores the two ORES models on all submitted drafts each day. Perhaps there are some things we can learn from that user's implementation: https://en.wikipedia.org/wiki/User:SQL/AFC-Ores.
  • It will be great if we can sanity check our scores before integrating them into the software. As we're working on this development, it would be good to be able to export lists of scored pages so that humans can look them over and make sure the scores and cutoffs make sense.
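
As a rough sketch of the export idea in the last note, the following Python writes one CSV row per scored page so that a human can look over the scores. The input structure, the column names, and the sample pages are made up for this example; fetching the scores themselves is out of scope here.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# draftquality categories as named in the task description.
CATEGORIES = ["OK", "spam", "vandalism", "attack"]


def export_scored_pages(pages: Iterable[Dict], out: TextIO) -> None:
    """Write a header plus one row per page: title, top category, per-category scores."""
    writer = csv.writer(out)
    writer.writerow(["title", "top_category"] + CATEGORIES)
    for page in pages:
        probs = page["draftquality"]
        top = max(CATEGORIES, key=lambda category: probs[category])
        writer.writerow(
            [page["title"], top] + [f"{probs[category]:.2f}" for category in CATEGORIES]
        )


# Made-up sample data, not real scores.
sample = [
    {"title": "Draft:Example A",
     "draftquality": {"OK": 0.90, "spam": 0.05, "vandalism": 0.03, "attack": 0.02}},
    {"title": "Draft:Example B",
     "draftquality": {"OK": 0.30, "spam": 0.55, "vandalism": 0.10, "attack": 0.05}},
]

buffer = io.StringIO()
export_scored_pages(sample, buffer)
print(buffer.getvalue())
```

Writing to any text stream (a file opened with `newline=""`, or a `StringIO` as here) keeps the function easy to reuse for a one-off review spreadsheet.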

Note: the specifics listed above and the wireframes shown below may be changed by ongoing community conversation around the design, which can be found here.

Here are wireframes of what the feed would look like after this work for both the NPP and AfC cases, showing the work from T195545, T195924, and T195547, and now with the ORES models in the filter menu (note that these wireframes do not show many of the details that should remain unchanged, like the info listed with each page in the list):


Event Timeline

MMiller_WMF updated the task description. May 29 2018, 10:24 PM
kaldari set the point value for this task to 8. May 30 2018, 12:29 AM
kaldari edited projects, added Community-Tech-Sprint; removed Community-Tech.
SQL added a subscriber: SQL. May 30 2018, 2:48 AM
TBolliger moved this task from Untriaged to Estimated on the Community-Tech board.

@MMiller_WMF

When model categories are selected, the selected categories should be listed next to the word "Showing" in the list's header. It would be great if @alexhollender could weigh in here on the logic of how to list the selected categories, taking into account the various possibilities for ANDs and ORs. This will be determined during the week of May 28.

  • I wonder if the distinction you're bringing up is relatively nuanced (it took me a few minutes to wrap my head around), i.e. maybe we should be wary of over-solving here. Unless the filter menu said something like "only show results that match all of the following criteria", I think the OR logic is implied. I assume the AND logic gets intuited because people are familiar with the filter options; however, maybe we shouldn't rely on that going forward.
  • If I'm understanding this correctly, we don't currently make the distinction. If we did I believe the text for the following filter state would read: Showing (results that are): reviewed OR unreviewed OR nominated for deletion AND ARE orphans

  • I do think it would be helpful to call out which categories the applied filters are part of, primarily to help familiarize people with the new filter structure we're introducing. This might also alleviate any confusion related to what you've called out. Hoping something like this would do the trick:

Okay, thanks @alexhollender. It sounds like you're saying that right now the feed doesn't capture that nuance at all, so although we would not be making the logic worse by adding ORES and copyvio to it, it would exacerbate any ambiguity that currently exists. @Samwilson @MusikAnimal -- what do you think about whether it is worthwhile and easy to make that line read more clearly, like in Alex's mockup (the second image in his post)?

I think what it says now is fine ("Showing: unreviewed, nominated for deletion, orphans"). Changing that to "Showing: awaiting review, B-class, C-class, Attack, Low copyvio" is just as easy to understand, to me, since there's no crossover of values for each filter. Alex's mockup works great too, also easy to understand. Structuring it in this way should be easy to implement.

Thanks @MusikAnimal. Let's shoot for @alexhollender's version then, which I think will be a worthwhile investment in whatever the future of this feed becomes.
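
A minimal Python sketch of the grouped header text discussed in this thread, plus the filter semantics it implies (OR within a group, AND across groups). The function names, the dictionary layout, and the empty-selection wording are assumptions for illustration; the real feed is built in PHP and JavaScript.

```python
from typing import Dict, List


def format_active_filters(selected: Dict[str, List[str]]) -> str:
    """Render 'Active filters: Group (a, b), Group (c)' from the chosen values."""
    groups = [
        f"{label} ({', '.join(values)})"
        for label, values in selected.items()
        if values  # skip groups with nothing selected
    ]
    if not groups:
        return "Active filters: none"  # assumed wording for the empty case
    return "Active filters: " + ", ".join(groups)


def page_matches(page: Dict[str, str], selected: Dict[str, List[str]]) -> bool:
    """OR within a group, AND across groups; empty groups do not constrain."""
    return all(
        page.get(label) in values
        for label, values in selected.items()
        if values
    )


selected = {
    "State": ["Awaiting review"],
    "Quality": ["B-class", "C-class"],
    "Classification": ["attack"],
    "Copyvio": ["below 50"],
}
print(format_active_filters(selected))
# Active filters: State (Awaiting review), Quality (B-class, C-class), Classification (attack), Copyvio (below 50)
```

Grouping values under their filter name is what makes the AND/OR reading visible: values inside the parentheses are alternatives, while the groups themselves all have to hold.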

MMiller_WMF updated the task description. Jun 1 2018, 10:44 PM
Vvjjkkii renamed this task from New Pages Feed: filtering on ORES scores (3.2) to w0baaaaaaa. Jul 1 2018, 1:07 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed the point value for this task.
Vvjjkkii removed a subscriber: Aklapper.
MusikAnimal renamed this task from w0baaaaaaa to New Pages Feed: filtering on ORES scores (3.2). Jul 1 2018, 2:02 AM
MusikAnimal updated the task description.
MusikAnimal set the point value for this task to 8.
MusikAnimal added a subscriber: Aklapper.
MusikAnimal raised the priority of this task from High to Needs Triage. Jul 1 2018, 2:05 AM
kostajh claimed this task. Jul 2 2018, 7:19 PM
kostajh updated the task description. Jul 3 2018, 12:44 PM

To implement the design concepts, we'll need to:

  1. New Pages Feed: switch the filters to a three-column layout with two rows.
    1. The existing "That" label and filters move to the second row.
    2. Add the six "Predicted class" checkboxes and the four "Predicted issues" checkboxes in row 1, columns 2 and 3. These are added in getTriageTemplatesHTML().
  2. AFC: change to a three-column layout with one row.
    1. The "That" label is renamed to "State".
    2. Add the six "Predicted class" checkboxes and the four "Predicted issues" checkboxes in row 1, columns 2 and 3, in getTriageTemplatesHTML().
  3. Nice-to-have, seemingly low level of effort: migrate template code from getTriageTemplatesHTML() into PageTriage/includes/templates/newpagesfeed.mustache
  4. Set allowed API params
    1. in listControlNav.js getApiParams
    2. ensure that filters stick after use, update menuSyncNpp and menuSyncAfc
    3. In ApiPageTriageList.php, update getAllowedParams
  5. Update getPageIDs in ApiPageTriageList, update query to use ORES tables as noted in T195796
  6. Update getMetadata in ArticleMetadata.php to load ORES data into article metadata
  7. Update/add tests
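
Step 4 above can be sketched roughly as follows. This is Python for illustration only (the actual changes are in listControlNav.js and ApiPageTriageList.php), and the parameter names are hypothetical; they are not taken from the extension.

```python
from typing import Dict, Iterable, List

# Category keys follow the checkbox lists in the task description.
PREDICTED_CLASS = ["stub", "start", "c", "b", "good", "featured"]
PREDICTED_ISSUES = ["none", "spam", "vandalism", "attack"]


def allowed_params() -> List[str]:
    """Hypothetical boolean parameter names, one per checkbox."""
    return (
        [f"show_predicted_class_{name}" for name in PREDICTED_CLASS]
        + [f"show_predicted_issues_{name}" for name in PREDICTED_ISSUES]
    )


def build_api_params(checked: Iterable[str]) -> Dict[str, int]:
    """Turn the checked checkboxes into request parameters, rejecting unknown names."""
    checked_list = list(checked)
    allowed = set(allowed_params())
    unknown = [name for name in checked_list if name not in allowed]
    if unknown:
        raise ValueError(f"Unknown filter parameters: {unknown}")
    return {name: 1 for name in checked_list}


print(build_api_params(["show_predicted_class_b", "show_predicted_issues_spam"]))
# {'show_predicted_class_b': 1, 'show_predicted_issues_spam': 1}
```

Keeping a single list of allowed names that both the client and the API consult is one way to make sure the two sides stay in sync as filters are added.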

I've added T198747 and T198748 based on our team's discussion of the comment above.

MaxSem removed a subscriber: MaxSem. Jul 3 2018, 7:16 PM

I broke out the work to improve the display logic for chosen filters into a separate task: T199277.

SBisson removed the point value for this task. Jul 13 2018, 2:39 PM
SBisson added a subscriber: alexhollender.
Niharika removed a subscriber: Niharika. Jul 13 2018, 7:54 PM

Now that all other ORES tickets are Done, this old parent task is also Done. Anything outstanding is ticketed separately.

MMiller_WMF closed this task as Resolved. Aug 16 2018, 8:05 PM