The work in this task and in T195927 makes up the third useful feature change that we could roll out to users. The work in this task is part of accomplishing these user stories:
- As a reviewer, I need to be able to filter by the four categories in the ORES draftquality model (vandalism, spam, attack, ok).
- As a reviewer, I need to be able to filter by the six categories in the ORES wp10 model (Stub, Start, C-class, B-class, Good, Featured).
Specifically, the work here is to generate scores; filtering by those scores is covered in T195927:
- For all pages in the New Pages Feed, including the Article, Draft, and User namespaces, generate scores from both the draftquality and wp10 models described on the ORES page. It is unlikely that we will use the scores from the User namespace for anything.
- The draftquality model returns four scores and the wp10 model returns six scores, one per category listed above. I recommend that we store or otherwise have access to all ten of those scores, and separately apply logic to determine which classes to display for those two models, as described in T195927. This may give us needed flexibility to change our logic later on.
- We have some flexibility in when to score pages. Ideally, we would score pages upon their first appearance in the New Pages Feed, and rescore them on each successive edit as long as they are part of the feed. Rescoring is more important for NPP work than AfC work, as new articles tend to be edited soon after they are first created, whereas new drafts submitted to AfC do not. We have two main options here, according to @Halfak and @Ladsgroup:
- The Scoring team is currently in the process of adding the draftquality and wp10 models to the set of models that are already being rescored on every edit, storing the scores in the MediaWiki database for uses like the New Pages Feed (T190471). We need to decide whether this will work for our use case and on our timeline.
- New Pages Feed could also use the ORES API to query for new model scores. As a side note, for fastest throughput, @Halfak recommends two parallel connections requesting scores on 50 pages at a time.
- If we decide for technical reasons that rescoring models with every edit is not a good idea, these are some potential alternative business rules that the team can discuss:
- Rescore models once a day (or other time period) on pages that have been changed in the previous day (or other time period).
- Rescore models on a given page after a certain number of edits.
- Rescore models on a given page after a certain number of bytes changed.
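To make the API option and the alternative rules above concrete, here is a minimal Python sketch under stated assumptions. It picks pages by the edit-count and byte-count rules, then requests scores in batches of 50 over two parallel connections, as @Halfak recommends. The `PageActivity` record, the threshold values, and the `fetch_scores` callable are all hypothetical; `fetch_scores` stands in for whatever ORES request the team ends up using and is passed in rather than implemented here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

BATCH_SIZE = 50       # pages per request, per the recommendation above
MAX_CONNECTIONS = 2   # parallel connections, per the recommendation above

# Hypothetical thresholds; the real values are a team decision.
EDIT_THRESHOLD = 5
BYTE_THRESHOLD = 1000


@dataclass
class PageActivity:
    """Hypothetical record of a page's activity since it was last scored."""
    rev_id: int
    edits_since_score: int
    bytes_changed_since_score: int


def needs_rescore(page: PageActivity) -> bool:
    """Either rule triggers a rescore: enough edits, or enough bytes changed."""
    return (page.edits_since_score >= EDIT_THRESHOLD
            or page.bytes_changed_since_score >= BYTE_THRESHOLD)


def chunked(items: List[int], size: int) -> Iterable[List[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def rescore(pages: List[PageActivity],
            fetch_scores: Callable[[List[int]], Dict[int, dict]]) -> Dict[int, dict]:
    """Score the pages that need it, 50 per request, on two connections.

    `fetch_scores` is an assumed callable that takes a batch of revision IDs
    and returns a mapping of revision ID to that revision's scores.
    """
    rev_ids = [p.rev_id for p in pages if needs_rescore(p)]
    scores: Dict[int, dict] = {}
    with ThreadPoolExecutor(max_workers=MAX_CONNECTIONS) as pool:
        for batch_result in pool.map(fetch_scores, chunked(rev_ids, BATCH_SIZE)):
            scores.update(batch_result)
    return scores
```

Running `rescore` once a day (or other time period) would correspond to the first rule; setting `EDIT_THRESHOLD` to 1 would approximate rescoring on every edit.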
Two other notes:
- User:SQL made a page that scores the two ORES models on all submitted drafts each day. Perhaps there are some things we can learn from that user's implementation: https://en.wikipedia.org/wiki/User:SQL/AFC-Ores.
- It would be great if we could sanity-check our scores before integrating them into the software. While we're working on this development, it would be good to be able to export lists of scored pages so that humans can look them over and make sure the scores and cutoffs make sense.
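One way such an export could look is sketched below in Python: a CSV with all ten stored scores per page plus a derived class for each model, so reviewers can compare the raw scores against what would be displayed. The class labels are copied from the user stories above and may not match the keys the models actually return. The input record shape and the "highest score wins" rule are placeholders; the real display logic is defined in T195927.

```python
import csv
import io
from typing import Dict, List

# Labels as written in the user stories; the models' real keys may differ.
DRAFTQUALITY_CLASSES = ["vandalism", "spam", "attack", "ok"]
WP10_CLASSES = ["Stub", "Start", "C-class", "B-class", "Good", "Featured"]


def top_class(scores: Dict[str, float]) -> str:
    """Placeholder display rule: pick the class with the highest score."""
    return max(scores, key=lambda name: scores[name])


def export_scores(pages: List[dict]) -> str:
    """Return CSV text with all ten scores and the derived class per model.

    Each page is assumed to be a dict with "title", "draftquality", and
    "wp10" keys, where the last two map class names to scores.
    """
    header = (["title"]
              + [f"draftquality_{c}" for c in DRAFTQUALITY_CLASSES]
              + [f"wp10_{c}" for c in WP10_CLASSES]
              + ["draftquality_top", "wp10_top"])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for page in pages:
        dq, wp = page["draftquality"], page["wp10"]
        writer.writerow([page["title"]]
                        + [dq[c] for c in DRAFTQUALITY_CLASSES]
                        + [wp[c] for c in WP10_CLASSES]
                        + [top_class(dq), top_class(wp)])
    return out.getvalue()
```

Because the raw scores are kept alongside the derived class, the cutoff logic can be changed later and the same export rerun to see how the displayed classes shift.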
Note: the specifics listed above may be changed by ongoing community conversation around the design, which can be found here.