Change language describing "Likely" filters to avoid mentioning "May" filters
Closed, Resolved (Public)

Description

The current description of the ORES "Likely" filters is designed to get around the fact that they cover a wide range of precision/recall results—by just describing the recall relative to other filters:

With medium accuracy, finds more problem edits than the “Very Likely” filter but fewer than “May.”

But with the more flexible way we're setting levels now, three wikis (at present) don't have the "May" filters at all. So a new solution is needed that doesn't refer to them. Also, the existing formula is very wordy.

Technical changes

We will create two "new" "Likely" filters, one in Quality and one in Intent, that have the same names as the current ones and the same threshold assignments. The only change is that these new filters have different description language.
We'll assign the "new" filters to some wikis and the "old" ones to others. (Strictly as internal names, I'm calling the variations "Low" (for low-fit model) and "High" (for high-fit model).) Here are the filter assignments:
Quality filter assignments:
- Low-fit: en, pt, cs, fa, nl, ru, tr, et, fi, ro, sq, fr
- High-fit wikis: pl, wd, he.
Intent filter assignments:
- Low-fit: en, pt, cs, fa, nl, ru, tr, et, fi, pl, ro, sq, fr
- High-fit wikis: wd, he.
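
As a rough illustration only (not the shipped implementation), the assignments above could be written as a lookup table. The names `HIGH_FIT`, `LOW_FIT`, and `variant_for` are hypothetical and exist only for this sketch; the wiki codes are copied from the lists above.

```python
from typing import Optional

# Wiki codes copied from the assignment lists in this task description.
HIGH_FIT = {
    "quality": {"pl", "wd", "he"},
    "intent": {"wd", "he"},
}
LOW_FIT = {
    "quality": {"en", "pt", "cs", "fa", "nl", "ru", "tr", "et", "fi", "ro", "sq", "fr"},
    "intent": {"en", "pt", "cs", "fa", "nl", "ru", "tr", "et", "fi", "pl", "ro", "sq", "fr"},
}


def variant_for(wiki: str, family: str) -> Optional[str]:
    """Return "high" or "low" for a wiki, or None if the wiki is not listed."""
    if wiki in HIGH_FIT[family]:
        return "high"
    if wiki in LOW_FIT[family]:
        return "low"
    return None
```

Note that pl is high-fit for Quality but low-fit for Intent, so the two families need separate lists.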

Language changes

Note that in addition to changing the descriptions of the "Likely" filters, we will also amend the "Very likely" descriptions, by changing "highly accurate" to "very highly accurate," to help distinguish these from the high-fit "Likely" filters.

Quality filters

[Description text for "Likely have problems"—Low]
With medium accuracy, finds an intermediate fraction of problem edits.

[Description text for "Likely have problems"—High]
With high accuracy, finds most problem edits.

[Description text for "Very likely have problems"-Both]
Very highly accurate at finding the most obviously flawed or damaging edits.

Intent filters

[Description text for "Likely bad faith"—Low]
With medium accuracy, finds an intermediate fraction of bad-faith edits.

[Description text for "Likely bad faith"—High]
With medium accuracy, finds most bad-faith edits. [yes, "medium" accuracy is correct here.]

[Description text for "Very likely bad faith"—Both]
Very highly accurate at finding the most obvious bad-faith edits.

Details

	Subject	Repo	Branch	Lines +/-
	Messages for low and high accuracy likelybad filters	mediawiki/extensions/ORES	master	+18 -8


Related Objects

Status	Assigned	Task
Duplicate	Qgil	T125545 Phabricator Q&A session for Community Liaisons
Resolved	Qgil	T116025 Goal: Align Community Liaison and Developer Relations project management practices
Resolved	Qgil	T119387 Community Liaison and Developer Relation quarterly goals for January - March 2016
Open	None	T121500 Unify product documentation for users to make it easier to share, translate and edit
Resolved	Johan	T128790 Translation strategy – act and refine
Resolved	Trizek-WMF	T129088 Create an easy to maintain glossary to facilitate documentation translation (help pages and technical documentation)
Resolved	• jmatazzoni	T145875 Create and maintain Edit Review Improvements documentation
Resolved	• DannyH	T171977 Annual Plan 2017-2018, Audiences 5: Increase current editor retention and engagement
Resolved	• DannyH	T171981 Annual Plan 2017-2018, Audiences 5, Goal 2: Give better ways to monitor contributions
Resolved	• jmatazzoni	T157642 Graduate New Filters UX out of beta on Recent Changes on ALL wikis
Resolved	• jmatazzoni	T144458 Launch ERI RC page features as a Beta Feature to all wikis
Resolved	SBisson	T144457 Invite users to opt in to the RC Filters beta from the RC page, and educate them about its features
Resolved	Mooeypoo	T144448 Build all front-end elements for the new Recent Changes (RC) Page user interface
Resolved	Pginer-WMF	T142785 Design interface for displaying and filtering ORES Good-Faith and Damaging scores as well as New Users flag
Resolved	Pginer-WMF	T147295 Design a way to better orient users when combining multiple highlights
Resolved	Pginer-WMF	T147351 Design a way to help users understand the difference between Highlighting and Filtering, and to use both effectively
Resolved	Pginer-WMF	T147549 Design a way to better discover and navigating between filtering and results
Open	None	T150836 Design onboarding and education process for the new Recent Changes filters
Resolved	Pginer-WMF	T147838 Design a way to educate on the option to highlight results on the Recent Changes designs
Open	None	T150312 Design a way to spotlight the new filters on the RC page interface
Resolved	Pginer-WMF	T151994 Facilitate repetitive use for the new Recent Changes filters
Resolved	Pginer-WMF	T154467 Initial exploration for Recent Changes optimization to repetition
Resolved	Mooeypoo	T164128 Allow users to save their filter selections for later reuse
Resolved	Mooeypoo	T164861 Tweaks to Quick Links design & functions
Resolved	Mooeypoo	T165437 [betalabs-regression] Highlighting is not preserved for saved filters
Resolved	Catrope	T166822 Rename Quick Links menu and reword dialog box.
Resolved	SBisson	T171922 Add 'Make this the default' option to bookmarks action menu
Resolved	None	T164548 Move links at top of Recent Changes to a Quick Links menu
Resolved	Trizek-WMF	T164001 Explore the different quick links chosen by communities and displayed on Recentchangestext
Resolved	Catrope	T164617 Get stats on how frequently RC Page related links (at page top) are clicked
Resolved	Catrope	T166623 Actually get the numbers for how frequently RC Page related links are clicked.
Duplicate	None	T167944 Provide default bookmarks for the most used sets of filters
Open	None	T164952 Provide links to other projects' Recent Changes pages in the same language
Resolved	Trizek-WMF	T167945 Announce to communities the change for RecentChangesText for the new filters Beta feature
Resolved	Trizek-WMF	T167529 Document how to use the bookmark system for the Beta Recent Changes filters
Resolved	Mooeypoo	T149385 Approved interface text for RC page interface elements
Resolved	SBisson	T164997 Change language describing "Likely" filters to avoid mentioning "May" filters
Resolved	• jmatazzoni	T144451 Implement enhanced Recent Changes filters (and make them work with the new UI)
Resolved	SBisson	T149637 Implement functionality for RC page 'Experience level' filters
Invalid	None	T149640 Implement functionality for RC page 'User registration' filters
Resolved	SBisson	T149734 Implement functionality for RC page 'Contribution Quality' filters (ORES)
Resolved	SBisson	T149853 Implement functionality for RC page 'User Intent' filters (ORES)
Resolved	None	T149761 Fine-tune and finalize ORES score ranges for the Quality and Intent filters
Resolved	SBisson	T149859 Implement functionality for RC page 'Edit Authorship' filters
Resolved	SBisson	T149862 Implement functionality for RC page 'Automated contribution' filters
Resolved	SBisson	T149863 Implement functionality for RC page 'Significance' filters
Resolved	SBisson	T150060 Implement functionality for RC page 'Type of change' filters
Resolved	• jmatazzoni	T150059 Make sure all Preferences for Recent Changes are compatible with new filtering system/page tools (and that users' preferences carry over)
Resolved	None	T159300 Turn off 'classic' ORES highlighting on the RC page
Resolved	SBisson	T154486 Special:RC - have 'hidepageedits' filter out all Flow changes
Resolved	SBisson	T152061 Implement functionality for RC page 'Review status' filters
Resolved	• jmatazzoni	T150715 Release strategy for RC page improvements: what wikis get the new features when?
Resolved	Trizek-WMF	T146972 Announce and follow up with communities about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158332 Announce and follow up with communities group 2 about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158333 Announce and follow up with communities group 3 about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158335 Announce and follow up with communities group 4 about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158336 Announce and follow up with community group 1 about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T156157 Contact Portuguese Wikipedia about testing ERI Recent Changes Beta project
Resolved	Trizek-WMF	T158042 Followup with Polish Wikipedia about testing ERI filters for Recent Changes
Resolved	• jmatazzoni	T161655 Damaging levels on Polish Wikipedia overlap too much
Resolved	Catrope	T161767 Add more values to test_stats
Resolved	Catrope	T161706 Review ORES prediction visibility on wikis where they are enabled by default
Resolved	SBisson	T161888 Make ORES prediction disappear when the edit is reviewed by someone else with Flagged Revisions
Resolved	Halfak	T163153 Communicate new beta prefs and changes to ORES users specifically
Resolved	Trizek-WMF	T146669 Create dedicated pages for ERI Recent Changes Beta project
Resolved	• jmatazzoni	T141449 Define Edit Review Improvements glossary
Resolved	• jmatazzoni	T145157 Research current "new user" definitions and consider whether we need a different name for the ReviewStream and RC page “new user” flag / filter
Resolved	• jmatazzoni	T151477 Improve Filters for Special:Recent Changes documentation page on mediawiki.org
Resolved	Trizek-WMF	T154889 Mark Edit Review Improvements glossary for translation
Resolved	Pginer-WMF	T147632 Prototype an improved version of Recent Change designs
Resolved	• jmatazzoni	T146333 Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals
Resolved	Catrope	T150959 Integrate a feedback page link in Recent Changes Beta filters
Resolved	Pginer-WMF	T160063 Explore ways to represent visually the ORES-related filters and associated tradeoffs
Resolved	• jmatazzoni	T161015 Add screenshots to the help pages for Recent Changes Filters Beta project

Event Timeline

• jmatazzoni created this task. May 10 2017, 11:17 PM

• jmatazzoni moved this task from Untriaged to Product/Design Work on the Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017) board.

Likely have problems [Low]
Likely have problems [High]

After a first quick read, I thought "Low what?", "High what?". I'm afraid most people just do a quick scan.

Likely have problems [Low]
With medium accuracy, finds an intermediate percentage of problem edits.

The sub-line helps, but the [Low] one has a medium accuracy? The [High] one has a "high accuracy":

Likely have problems [High]
With high accuracy, finds most problem edits.

What about having a description that focuses on the spectrum of problematic edits the filter catches?

"Likely have problems [Large spectrum]"
"Likely have problems [Narrow spectrum]"

"Likely have problems [Global]"
"Likely have problems [Precise]"

Roan suggested one way to solve the problem: we can create "new" filters that have the same names as the old ones, but different descriptions. Then assign the "new" filters to some wikis and the "old" ones to others. Strictly as internal names, I'm calling the variations "Low" (for low-fit model) and "High" (for high-fit model).

Even if these are separate filters internally, the final result for our users is that the "Likely have problems" filter will use a different description based on how precise its underlying ORES model is. This sounds good to me, and the proposed text for the descriptions works well. I also assume that the "[Low]" and "[High]" indicators are just clarifications for the ticket, and they won't be part of the filter titles or exposed to users in any way.

In T164997#3254587, @Pginer-WMF wrote:

I also assume that the "[Low]" and "[High]" indicators are just clarifications for the ticket, and they won't be part of the filter titles or exposed to users in any way.

That's how I read it too, yes.

It'd be good to avoid referring to other filters, since filters may be dropped, merged, etc. For example, on plwiki the number of filters differs from enwiki; hopefully, the Polish translation of the descriptions does not mention the non-existent filter.

Screen Shot 2017-05-17 at 12.17.20 PM.png (459×689 px, 95 KB)

• jmatazzoni updated the task description. May 26 2017, 11:28 PM

BTW, if you want to check the actual stats against the new language, here are the precision/recall figures:

[Description text for "Likely have problems"—Low]
With medium accuracy, finds an intermediate fraction of problem edits.
[actual precision/recall stats: 64/26, 61/38, 61/82, 61/57, 46/39, 47/14, 63/46]

[Description text for "Likely have problems"—High]
With high accuracy, finds most problem edits.
[actual precision/recall stats: pl: 80/91, WD: 76/95, HE: 49/89]

[Description text for "Likely bad faith"—Low]
With medium accuracy, finds an intermediate fraction of bad-faith edits.
[ precision/recall stats: 62/23, 61/38, 63/66, 49/18, 62/62, 53/22, 52/12, 63/61, 62/38]

[Description text for "Likely bad faith"—High]
With medium accuracy, finds most bad-faith edits. [yes, "medium" accuracy is correct here.]
[ precision/recall stats: WD: 60/96, HE: 55/88 ]
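
For anyone who wants to do that check, here is a minimal sketch that reduces each list of precision/recall pairs above to its min/max range. The figures are copied from this comment (with the wiki labels dropped); the dictionary keys such as `quality-low` and the helper names are made up for this example.

```python
# Precision/recall pairs (in percent) copied from the lists above.
STATS = {
    "quality-low": "64/26, 61/38, 61/82, 61/57, 46/39, 47/14, 63/46",
    "quality-high": "80/91, 76/95, 49/89",
    "intent-low": "62/23, 61/38, 63/66, 49/18, 62/62, 53/22, 52/12, 63/61, 62/38",
    "intent-high": "60/96, 55/88",
}


def parse_pairs(text: str) -> list:
    """Turn "64/26, 61/38" into [(64, 26), (61, 38)]."""
    pairs = []
    for chunk in text.split(","):
        precision, recall = chunk.strip().split("/")
        pairs.append((int(precision), int(recall)))
    return pairs


def summarize(text: str) -> dict:
    """Return the (min, max) range of precision and of recall."""
    pairs = parse_pairs(text)
    precisions = [p for p, _ in pairs]
    recalls = [r for _, r in pairs]
    return {
        "precision": (min(precisions), max(precisions)),
        "recall": (min(recalls), max(recalls)),
    }


if __name__ == "__main__":
    for name, text in STATS.items():
        print(name, summarize(text))
```

The ranges make the wording easier to judge: the high-fit lists all have recall near 90% or above ("finds most"), while recall in the low-fit lists varies widely from wiki to wiki.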

• jmatazzoni updated the task description. May 26 2017, 11:33 PM

• jmatazzoni moved this task from Product/Design Work to Ready for Pickup on the Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017) board.

Technical question here: it seems like we need to have 2 sub-messages for what we currently have as a single message (so, splitting the description for "Likely have problems" into the "low" case and the "high" case) and then make sure that specific wikis (regardless of the interface language used on them!) receive the right one.

@Catrope how do we technically do this? Do we create 2 messages for translation, and then let the back-end decide which to ship based on a list of wikis? Do we need to create another sort of global-global variable (or config option?) to set apart the list of wikis per message?

Am I over-analyzing this, or do we need to create the infrastructure to do this? I was going to implement, but then got blocked on how to make sure each wiki gets the correct low/high representation.

• jmatazzoni edited projects, added Collaboration-Team-Triage (Collab-Team-This-Quarter); removed Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017). Jul 12 2017, 9:56 PM

• jmatazzoni moved this task from Untriaged to Ready for Pickup on the Collaboration-Team-Triage (Collab-Team-This-Quarter) board. Jul 12 2017, 10:18 PM

• jmatazzoni mentioned this in T170501: Graduate 'New filters for edit review' out of beta, making it standard on all wikis. Jul 12 2017, 10:53 PM

• jmatazzoni mentioned this in Collaboration-Team-Triage (Collab-Team-This-Quarter). Jul 17 2017, 8:53 PM

In T164997#3297713, @Mooeypoo wrote:

Technical question here: it seems like we need to have 2 sub-messages for what we currently have as a single message (so, splitting the description for "Likely have problems" into the "low" case and the "high" case) and then make sure that specific wikis (regardless of the interface language used on them!) receive the right one.

@Catrope how do we technically do this? Do we create 2 messages for translation, and then let the back-end decide which to ship based on a list of wikis? Do we need to create another sort of global-global variable (or config option?) to set apart the list of wikis per message?

Am I over-analyzing this, or do we need to create the infrastructure to do this? I was going to implement, but then got blocked on how to make sure each wiki gets the correct low/high representation.

My suggestion (and feel free to disagree with it or offer other suggestions) was to create two filters/levels, one called e.g. likely-high and the other likely-low. They would both have separate i18n messages, of course, but we would configure the wikis so that one of them is always disabled.

• jmatazzoni mentioned this in T157642: Graduate New Filters UX out of beta on Recent Changes on ALL wikis. Jul 21 2017, 7:26 PM

• jmatazzoni added a parent task: T157642: Graduate New Filters UX out of beta on Recent Changes on ALL wikis. Jul 21 2017, 7:32 PM

In T164997#3450521, @Catrope wrote:

My suggestion (and feel free to disagree with it or offer other suggestions) was to create two filters/levels, one called e.g. likely-high and the other likely-low. They would both have separate i18n messages, of course, but we would configure the wikis so that one of them is always disabled.

That works. Another somewhat similar option is to allow for messages override where the filter levels are configured (per wiki).

"OresFiltersThresholds": {
	"damaging": {
		"likelygood": { "min": 0, "max": "recall_at_precision(min_precision=0.995)" },
		"maybebad": false,
		"likelybad": { "min": "recall_at_precision(min_precision=0.6)", "max": 1, "messages": { "description": "...likelybad-high" } },
		"verylikelybad": { "min": "recall_at_precision(min_precision=0.9)", "max": 1 }
	},
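
To make the override idea concrete, here is a minimal sketch of how a per-level "messages" entry could be resolved. It is written in Python purely for illustration; `description_key` and the `default-desc` key are hypothetical names, and the elided "...likelybad-high" value is kept exactly as in the config above.

```python
from typing import Optional, Union

# A level is either a dict of settings or False when it is disabled.
LevelConfig = Union[dict, bool]


def description_key(level: LevelConfig, default_key: str) -> Optional[str]:
    """Return the description message key for one filter level."""
    # A level set to False (like "maybebad" above) is disabled: no message.
    if not isinstance(level, dict):
        return None
    # Use the per-wiki override when present, otherwise the default key.
    return level.get("messages", {}).get("description", default_key)


damaging = {
    "maybebad": False,
    "likelybad": {
        "min": "recall_at_precision(min_precision=0.6)",
        "max": 1,
        "messages": {"description": "...likelybad-high"},
    },
    "verylikelybad": {"min": "recall_at_precision(min_precision=0.9)", "max": 1},
}

if __name__ == "__main__":
    for name, level in damaging.items():
        print(name, description_key(level, "default-desc"))
```

With this approach every wiki that needs the high-fit wording has to carry the override in its own config.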

Yet another option is to have both likelybad-low and likelybad-high messages defined as described in this ticket but let the code pick one based on the presence of the maybe-bad filter level. Yes, it's hardcoded, like similar rules around filters (subset, conflict).

Thinking about this again, all of these solutions are about equally messy, but your second one (have the code pick the message) requires the least ongoing maintenance in the config file, so I think we should prefer that.

In T164997#3481953, @Catrope wrote:

Thinking about this again, all of these solutions are about equally messy, but your second one (have the code pick the message) requires the least ongoing maintenance in the config file, so I think we should prefer that.

I agree. Also very easy to implement.

To save translators' time, should we say that the current messages correspond to -low and new ones will be created with the suffix -high or should we remove the current ones and create both -low and -high?

EDIT: Looking at the text of the messages, both -low and -high have new wording and need to be translated. I don't know if there's any point in keeping the old messages (without any suffix).

SBisson claimed this task. Jul 28 2017, 5:53 PM

SBisson moved this task from Ready for Pickup to In Development on the Collaboration-Team-Triage (Collab-Team-This-Quarter) board.

Change 368463 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/ORES@master] Messages for low and high accuracy likelybad filters

https://gerrit.wikimedia.org/r/368463

gerritbot added a project: Patch-For-Review. Jul 28 2017, 6:00 PM

SBisson moved this task from In Development to Needs Review on the Collaboration-Team-Triage (Collab-Team-This-Quarter) board. Jul 28 2017, 6:10 PM

Catrope moved this task from Needs Review to QA Review on the Collaboration-Team-Triage (Collab-Team-This-Quarter) board. Jul 28 2017, 10:38 PM

Change 368463 merged by jenkins-bot:
[mediawiki/extensions/ORES@master] Messages for low and high accuracy likelybad filters

https://gerrit.wikimedia.org/r/368463

ReleaseTaggerBot added a project: MW-1.30-release-notes (WMF-deploy-2017-08-01_(1.30.0-wmf.12)). Jul 28 2017, 11:00 PM

Message 'ores-rcfilters-goodfaith-bad-desc-high' currently says "With medium accuracy...". Message name and analogous 'damaging' filter message suggest that it intends to say "With high accuracy..." instead.

In T164997#3487319, @Pikne wrote:

Message 'ores-rcfilters-goodfaith-bad-desc-high' currently says "With medium accuracy...". Message name and analogous 'damaging' filter message suggest that it intends to say "With high accuracy..." instead.

@Pikne , thanks for your comment. Can you say what wiki you're looking at?

In T164997#3487355, @jmatazzoni wrote:

@Pikne , thanks for your comment. Can you say what wiki you're looking at?

It's from the above patch for which I was translating new messages on translatewiki.net.

In T164997#3487319, @Pikne wrote:

Message 'ores-rcfilters-goodfaith-bad-desc-high' currently says "With medium accuracy...". Message name and analogous 'damaging' filter message suggest that it intends to say "With high accuracy..." instead.

From the task description:

[Description text for "Likely bad faith"—High]
With medium accuracy, finds most bad-faith edits. [yes, "medium" accuracy is correct here.]

So it looks like this asymmetry was a deliberate choice on @jmatazzoni's part.

In T164997#3487467, @Catrope wrote:

In T164997#3487319, @Pikne wrote:

Message 'ores-rcfilters-goodfaith-bad-desc-high' currently says "With medium accuracy...". Message name and analogous 'damaging' filter message suggest that it intends to say "With high accuracy..." instead.

From the task description:

[Description text for "Likely bad faith"—High]
With medium accuracy, finds most bad-faith edits. [yes, "medium" accuracy is correct here.]

So it looks like this asymmetry was a deliberate choice on @jmatazzoni's part.

Yes, the precision for the two high fit models is 58 and 60%. So medium.

3 wikis have been added since this task was written: ro, sq, fr. They are all low-fit models, and I'm adding them to the Description. But @SBisson, are you definitely automating this? Good idea!

• jmatazzoni updated the task description. Jul 31 2017, 9:00 PM

@SBisson Out of the list

High-fit wikis: pl, wd, he

only hewiki exists in betalabs, but ORES-based filters are not enabled there. hewiki production has such filters. Is there any reason why we do not have them in beta? I checked both types of users (just in case), with the ores-enabled up_value equal to 0 and to 1.

This is how it works in the code (for both damaging and goodfaith):

if maybebad is present -> likelybad uses the -low message
if maybebad is NOT present -> likelybad uses the -high message

There is no new config to deploy for individual wikis.
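
The rule above can be sketched in a few lines. This is an illustration in Python, not the actual extension code: `likelybad_description_key` is a hypothetical helper, and the key pattern is inferred from the 'ores-rcfilters-goodfaith-bad-desc-high' message named earlier in this thread, so the other three key names are assumptions.

```python
def likelybad_description_key(model: str, levels: dict) -> str:
    """Pick the likelybad description message for one model.

    `model` is "damaging" or "goodfaith"; `levels` is that model's
    threshold config for the current wiki.
    """
    # maybebad counts as absent when it is missing or set to False,
    # as in the "maybebad": false config line quoted earlier.
    has_maybebad = levels.get("maybebad") not in (None, False)
    suffix = "low" if has_maybebad else "high"
    # Key pattern assumed from 'ores-rcfilters-goodfaith-bad-desc-high'.
    return f"ores-rcfilters-{model}-bad-desc-{suffix}"


if __name__ == "__main__":
    no_maybebad = {"maybebad": False, "likelybad": {"min": 0.6, "max": 1}}
    with_maybebad = {"maybebad": {"min": 0.15, "max": 1}, "likelybad": {"min": 0.6, "max": 1}}
    print(likelybad_description_key("goodfaith", no_maybebad))
    print(likelybad_description_key("damaging", with_maybebad))
```

The numeric thresholds in the usage lines are placeholders; only the presence or absence of maybebad affects the result.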

Wording for Low-fit wikis was checked in betalabs. Waiting for the wmf.12 deployment to check High-fit wikis.

Checked in wmf.12 - all filters' descriptions have been updated.

QA Recommendation: Resolve

Etonkovidova moved this task from QA Review to Product Review on the Collaboration-Team-Triage (Collab-Team-This-Quarter) board. Aug 4 2017, 10:24 PM

• jmatazzoni closed this task as Resolved. Aug 4 2017, 11:46 PM

	F8113023: Screen Shot 2017-05-17 at 12.17.20 PM.png
	May 17 2017, 7:21 PM
