Implement functionality for RC page 'User Intent' filters (ORES)
Closed, ResolvedPublic

Description

The User Intent filters are based on the ORES good-faith test. By providing predictions about which edits were likely made in good or bad faith, they enable reviewers to make better decisions about how to remediate edits generally and, in particular, to find good-faith users who need help. Based on the explorations in T146333, we've settled on three User Intent options (view in prototype).

In creating these three levels, we strove to balance users' desire for accuracy against their desire for breadth of coverage. The ORES Good Faith score range for each filter is listed below (in square brackets, subject to finalization in T149761). For final filter names and description texts, see T149385.

  • Very likely good faith [35% - 100%]
  • May be bad faith [0% - 65%]
  • Likely bad faith [0% - 15%]
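
As a rough illustration of how the provisional ranges above overlap, here is a minimal Python sketch that maps a good-faith score to every filter that would include it. The function name, the 0-to-1 score scale, and the inclusive boundaries are assumptions made for this example; the ranges themselves are the provisional ones listed above and may change in T149761.

```python
# Provisional Good Faith score ranges from the list above (subject to T149761).
# Scores are expressed as fractions of 1; inclusive boundaries are an assumption.
FILTER_RANGES = {
    "Very likely good faith": (0.35, 1.00),
    "May be bad faith": (0.00, 0.65),
    "Likely bad faith": (0.00, 0.15),
}


def matching_filters(score):
    """Return the names of all filters whose range contains the score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return [
        name
        for name, (low, high) in FILTER_RANGES.items()
        if low <= score <= high
    ]


# A low score falls inside both "bad faith" ranges, which is the overlap
# that the notes below refer to.
print(matching_filters(0.10))
print(matching_filters(0.90))
```

With these ranges, a score between 35% and 65% also matches both "Very likely good faith" and "May be bad faith".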

Notes about the User Intent filters

  • These filter ranges will, at least initially, be identical with those built into the ReviewStream feed.
  • Because the "bad faith" filters overlap, they are subject to the behavior stated in T149391, under "No-Effect Display States", and T149452, under "Excluded Display States".

About the functionality of all new RC page filters generally

  • Like all filters in the enhanced RC page filter UI, these filters conform to a set of rules that differ in some ways from those of the existing RC page filters. The existing filters are designed primarily to EXCLUDE selected properties. The new filters are intended to INCLUDE those properties; logically, the filters within each new group constitute a set of OR filters (each group of ORs being connected to other groups by ANDs). So, the filters in this group follow these rules:
    • To INCLUDE property A, the user checks the box for property A.
    • To EXCLUDE property A, the user must uncheck A and check its complements, properties B and C.
    • If NONE of A, B or C are checked, then ALL are included.
    • If ALL of A, B and C are checked, then the result is the same: ALL are included.
  • As per T146076, searches on the RC page are meant to be bookmarkable. Please make sure your search adds query strings to the URL.
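
The rules above can be sketched in Python as follows. This is only an illustration of the OR-within-group, AND-across-groups logic and of a bookmarkable query string; the function names, the group and property names, and the "|" separator in the URL are hypothetical and do not describe the actual MediaWiki implementation.

```python
from urllib.parse import urlencode


def group_matches(checked, change_properties):
    """OR within a group: an empty selection includes everything."""
    if not checked:
        return True
    return bool(checked & change_properties)


def change_matches(groups, change_properties):
    """AND across groups: every group must accept the change."""
    return all(
        group_matches(checked, change_properties)
        for checked in groups.values()
    )


def to_query_string(groups):
    """Serialize the checked filters so the search can be bookmarked.

    Parameter names and the '|' separator are assumptions for this sketch.
    """
    params = {
        group: "|".join(sorted(checked))
        for group, checked in sorted(groups.items())
        if checked
    }
    return urlencode(params)


# To EXCLUDE property A, the user checks its complements B and C.
selection = {"intent": {"B", "C"}, "type": set()}
print(change_matches(selection, {"A"}))  # change has only property A
print(change_matches(selection, {"B"}))  # change has property B
print(to_query_string(selection))
```

If every change carries at least one property from a group, checking all of that group's boxes accepts every change, which matches the "ALL checked" rule without special handling.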

Event Timeline

Change 323327 had a related patch set uploaded (by Sbisson):
RC/Watchlist: Filter out parameters that cannot be displayed

https://gerrit.wikimedia.org/r/323327

Change 323328 had a related patch set uploaded (by Sbisson):
[WIP] goodfaith filter

https://gerrit.wikimedia.org/r/323328

Change 323327 merged by jenkins-bot:
RC/Watchlist: Filter out parameters that cannot be displayed

https://gerrit.wikimedia.org/r/323327

Change 323328 merged by jenkins-bot:
'goodfaith' filter on Special:RC / Special:Watchlist

https://gerrit.wikimedia.org/r/323328

@SBisson The 'goodfaith' filter (with any option) displays entries for non-edit actions such as user account creation, user blocks, and page moves, e.g.
(User creation log); 03:06 . . User account SomeUser (talk | contribs) was created

The 'damaging' filter does not display them.

Since such actions are not subject to ORES scoring, should they be excluded from the 'goodfaith' filter too?

That's not supposed to happen. Log entries do not even have a revision id to attach a score to. The inner join with the ores tables filters them out anyway.

Make sure you are testing it in an environment where the goodfaith test is enabled. I don't think it is in betalabs.

The way I do it locally is:

  1. Enable the goodfaith test: $wgOresModels = array( 'damaging' => true, 'goodfaith' => true, 'reverted' => false, 'wp10' => false );
  2. Create the goodfaith entry in the ores_model table: insert into ores_model (oresm_name, oresm_version, oresm_is_current) values ('goodfaith', '0.0.3', 1);
  3. Fetch the goodfaith model id: select oresm_id from ores_model where oresm_name = 'goodfaith';
  4. Seed the ores_classification table with random data: insert into ores_classification (oresc_rev, oresc_model, oresc_class, oresc_probability, oresc_is_predicted) select rev_id, <goodfaith model id>, 1, rand(), IF(rand() > 0.5, 1, 0) from revision;
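
To illustrate why the inner join drops log entries, here is a small self-contained Python sketch using an in-memory SQLite database. The two tables are heavily simplified stand-ins for the real MediaWiki and ORES tables, the sample rows and the model id of 1 are invented, and the query is not the one the extension actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the real tables (assumption for this sketch).
    CREATE TABLE recentchanges (
        rc_id INTEGER PRIMARY KEY,
        rc_title TEXT,
        rc_this_oldid INTEGER  -- revision id; 0 here for the log entry
    );
    CREATE TABLE ores_classification (
        oresc_rev INTEGER,
        oresc_model INTEGER,
        oresc_probability REAL
    );
    INSERT INTO recentchanges VALUES
        (1, 'Edited page', 101),
        (2, 'User creation log', 0);
    -- Only the real revision has a goodfaith score (model id 1 is made up).
    INSERT INTO ores_classification VALUES (101, 1, 0.9);
    """
)

rows = conn.execute(
    """
    SELECT rc_title, oresc_probability
    FROM recentchanges
    INNER JOIN ores_classification
        ON oresc_rev = rc_this_oldid AND oresc_model = 1
    ORDER BY rc_id
    """
).fetchall()

# The log entry has no matching score row, so the join leaves it out.
print(rows)
```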

Re-checked in vagrant with goodfaith enabled - all is working as expected.

QA recommendation: Resolve.

No one expects to get ORES scores for things like log entries, so I don't think the discussion above points to any issues with this ticket. But if I understand the discussion correctly, we've just identified a number of filter combinations that will never return any results.

Namely, settings will be in conflict if users choose any of the ORES filters in combination with any of the Type of Change filters other than Page Creation and Page Edits (i.e., Category Changes, Wikidata Edits and Logged Entries).
So, we need to take care of a few things:

  1. I need to define a new display state similar to the "Conflict between Unregistered & Experience filters" defined in T149391 (along with the associated tooltips).
  2. Roan mentions that in cases where filters are in conflict, we should provide a function that short-circuits the search, so the servers don't waste a lot of energy trying to find impossible stuff. I'll add a new ticket to make sure that happens.

@SBisson, am I correct in saying above that Page Creation and Page Edits are the only Type of Change filters that won't conflict with ORES filters?

@jmatazzoni - do you have some specific cases where ORES filters will be in conflict with 'Type of change' filters: 'Hide Wikidata' (hideWikibase), 'Hide page categorization' (hidecategorization), and hidelog?

For example, damaging=maybebad displays edits/pages that may be bad, along with changes in Categories.
But damaging=maybebad&hidecategorization=1, i.e. 'show me edits that may be bad but do not show any changes in Categories', will display only edits that may be bad.

Let me know which cases you would like to check. I did some testing of filter combinations and did not get any unexpected results. However, there are, of course, many combinations that might be tested, and it'd be great if we could narrow down the potentially most problematic cases or identify the cases that are especially important.

@SBisson, am I correct in saying above that Page Creation and Page Edits are the only Type of Change filters that won't conflict with ORES filters?

I think that's correct.

@jmatazzoni - do you have some specific cases when ORES filters will be in conflict with 'Type of change' filters

My understanding is that filtering by any Quality or Intent (ORES) filter in combination with any of the filters below would produce no possible results:

  • Category Changes
  • Wikidata Edits
  • Logged Actions

The reason being that Category Changes, Wikidata Edits and Logged Actions are not scored by ORES. So, if you ask to see the intersection, for example, of all edits that May Have Problems AND Logged Actions, the answer will always be null, because no edit will meet that requirement. Right?

@SBisson, I'd like to define the conflict states, so we can make the interface handle them consistently and include these cases. But I don't want to do it if that's not how this works. Can you please confirm that the three filters above are not scored by ORES?

@jmatazzoni

(1) It turned out that Category changes do get evaluated by ORES (as page edits), so adding or removing a Category will be subject to the damaging filters.
Is that expected?

(2)

if you ask to see the intersection, for example, of all edits that May Have Problems AND Logged Actions, the answer will always be null, because no edit will meet that requirement.

That specific example will show only edits that May Have Problems; logged actions will be filtered out by any of the damaging filters.

(3) We should probably give some thought to which filters (or categories of filters) have AND or OR relationships, or at least identify some use cases:

  • I want to see my own edits that are marked as 'Very likely bad' - filters used: "Your own edits" and "Very likely have problems" (?)
  • I want to see my own edits and newcomers' edits - there is no intersection between those categories, so what will the user see?
  • I want to see log entries and my own edits that were marked as 'Very likely bad'

OK. I discussed with Elena, and it looks like I need to define a conflict state for

  • Any ORES filter + Wikidata Edits
  • Any ORES filter + Logged Actions
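
The short-circuit suggested earlier in this thread could look roughly like the Python sketch below, applied to the two conflict cases just listed. All names here are hypothetical, and the rule that a search is impossible only when every selected Type of Change is unscored is an assumption of this sketch, not a confirmed design.

```python
# Types of change that, per the discussion above, ORES does not score.
UNSCORED_TYPES = {"Wikidata Edits", "Logged Actions"}


def cannot_match(ores_selected, types_selected):
    """True when an ORES filter is combined only with unscored types."""
    return (
        bool(ores_selected)
        and bool(types_selected)
        and types_selected <= UNSCORED_TYPES
    )


def run_search(ores_selected, types_selected, query):
    """Skip the (hypothetical) query callable when no result is possible."""
    if cannot_match(ores_selected, types_selected):
        return []
    return query(ores_selected, types_selected)


calls = []


def fake_query(ores_selected, types_selected):
    # Stand-in for the real database query; records that it was called.
    calls.append((ores_selected, types_selected))
    return ["some change"]


print(run_search({"Likely bad faith"}, {"Logged Actions"}, fake_query))
print(run_search({"Likely bad faith"}, {"Page Edits"}, fake_query))
print(len(calls))
```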

And with that, I'm closing this ticket.