Add more values to test_stats
Closed, Resolved, Public

Description

Please add the following statistics to https://ores.wikimedia.org/scores/enwiki/damaging/?model_info=test_stats and similar endpoints:

recall_at_precision(min_precision=0.6)
recall_at_precision(min_precision=0.75)
recall_at_precision(min_precision=0.99)
recall_at_precision(min_precision=0.995)
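
A minimal sketch of what a statistic such as recall_at_precision(min_precision=0.75) is assumed to mean here: sweep the candidate thresholds, keep those whose precision meets the minimum, and report the one with the highest recall. This is an illustration only, not the ORES or revscoring implementation; the function signature, the `scored` input format, and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def recall_at_precision(
    scored: List[Tuple[float, bool]], min_precision: float
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for the threshold with the
    highest recall whose precision is at least min_precision.

    `scored` holds (model score, true label) pairs. Returns None when no
    threshold reaches the requested precision.
    """
    total_pos = sum(1 for _, label in scored if label)
    best = None
    # Every distinct score is a candidate threshold.
    for threshold in sorted({score for score, _ in scored}):
        flagged = [label for score, label in scored if score >= threshold]
        true_pos = sum(flagged)
        precision = true_pos / len(flagged)  # flagged is never empty here
        recall = true_pos / total_pos if total_pos else 0.0
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (threshold, precision, recall)
    return best


# Hypothetical sample: (score for "damaging", whether the edit really was damaging)
SAMPLE = [
    (0.95, True), (0.9, True), (0.8, False),
    (0.7, True), (0.4, False), (0.2, False),
]

for target in (0.6, 0.75, 0.99, 0.995):
    print(target, recall_at_precision(SAMPLE, target))
```

On ties in recall, this sketch keeps the lowest qualifying threshold; a real implementation may break ties differently.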

Related Objects

Status	Assigned	Task
Duplicate	Qgil	T125545 Phabricator Q&A session for Community Liaisons
Resolved	Qgil	T116025 Goal: Align Community Liaison and Developer Relations project management practices
Resolved	Qgil	T119387 Community Liaison and Developer Relation quarterly goals for January - March 2016
Open	None	T121500 Unify product documentation for users to make it easier to share, translate and edit
Resolved	Johan	T128790 Translation strategy – act and refine
Resolved	Trizek-WMF	T129088 Create an easy to maintain glossary to facilitate documentation translation (help pages and technical documentation)
Resolved	• jmatazzoni	T145875 Create and maintain Edit Review Improvements documentation
Resolved	• DannyH	T171977 Annual Plan 2017-2018, Audiences 5: Increase current editor retention and engagement
Resolved	• DannyH	T171981 Annual Plan 2017-2018, Audiences 5, Goal 2: Give better ways to monitor contributions
Resolved	• jmatazzoni	T157642 Graduate New Filters UX out of beta on Recent Changes on ALL wikis
Resolved	• jmatazzoni	T144458 Launch ERI RC page features as a Beta Feature to all wikis
Resolved	• jmatazzoni	T150715 Release strategy for RC page improvements: what wikis get the new features when?
Resolved	Trizek-WMF	T146669 Create dedicated pages for ERI Recent Changes Beta project
Resolved	• jmatazzoni	T141449 Define Edit Review Improvements glossary
Resolved	• jmatazzoni	T145157 Research current "new user" definitions and consider whether we need a different name for the ReviewStream and RC page “new user” flag / filter
Resolved	• jmatazzoni	T151477 Improve Filters for Special:Recent Changes documentation page on mediawiki.org
Resolved	Trizek-WMF	T154889 Mark Edit Review Improvements glossary for translation
Resolved	Pginer-WMF	T147632 Prototype an improved version of Recent Change designs
Resolved	• jmatazzoni	T146333 Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals
Resolved	Catrope	T150959 Integrate a feedback page link in Recent Changes Beta filters
Resolved	Pginer-WMF	T160063 Explore ways to represent visually the ORES-related filters and associated tradeoffs
Resolved	• jmatazzoni	T161015 Add screenshots to the help pages for Recent Changes Filters Beta project
Open	None	T142782 Explore process for turning on RCPatrol for English and other relevant wikis
Resolved	Trizek-WMF	T158004 Release RC Page filtering to non-ORES wikis
Resolved	Trizek-WMF	T158225 Enable the ORES good faith and damaging UI by default, on wikis that have these ORES models available (instead of behind a Beta Feature)
Resolved	Trizek-WMF	T159223 Inform communities about the release of the ORES good faith and damaging UI by default
Resolved	Trizek-WMF	T146972 Announce and follow up with communities about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158336 Announce and follow up with community group 1 about the New Filters for Recent changes Beta deployment
Resolved	Trizek-WMF	T158042 Followup with Polish Wikipedia about testing ERI filters for Recent Changes
Resolved	• jmatazzoni	T161655 Damaging levels on Polish Wikipedia overlap too much
Resolved	Catrope	T161767 Add more values to test_stats
Resolved	Catrope	T161706 Review ORES prediction visibility on wikis where they are enabled by default
Resolved	SBisson	T161888 Make ORES prediction disappear when the edit is reviewed by someone else with Flagged Revisions

Event Timeline

Catrope created this task. Mar 29 2017, 9:55 PM

It looks like filter_rate is not the same as precision. Is there a way to find out what the precision is at the threshold produced by filter_rate_at_recall(min_recall=N)? Ditto for filter_rate_at_fpr(max_fpr=N).
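
To illustrate why the two numbers differ, here is a small sketch. It assumes that filter_rate is the share of all edits that fall below the threshold (and so need no review) and that precision is the share of flagged edits that are truly damaging. The helper functions and the counts are made up for the example and are not taken from ORES.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of flagged edits that are actually damaging."""
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def filter_rate(true_pos: int, false_pos: int, total: int) -> float:
    """Share of all edits that are NOT flagged, i.e. filtered out of review."""
    return 1 - (true_pos + false_pos) / total


# Hypothetical counts: 1000 edits, 100 of them flagged at some threshold,
# and 36 of the flagged edits are really damaging.
TOTAL_EDITS = 1000
TRUE_POS = 36
FALSE_POS = 64

print("precision:", precision(TRUE_POS, FALSE_POS))
print("filter_rate:", filter_rate(TRUE_POS, FALSE_POS, TOTAL_EDITS))
```

With these counts, 90% of edits are filtered out of review while only 36% of the flagged edits are damaging, so a high filter rate says little about precision.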

OK. This can be done. Adding a precision_at_recall metric would be one way to kludge this in.

"filter_rate_at_recall" is a process-optimizing metric, whereas "precision_at_recall" is not. The process doesn't care how often a positive prediction is actually correct (precision). It cares what proportion of recent_changes items do not need to be reviewed (filter_rate). Still, "precision_at_recall" seems to give users important information for setting their expectations.

Just thinking about this now, it seems we should probably have a basic set of test statistics at any threshold that we're optimizing for. "precision" seems like useful information but doesn't get reported for all thresholds.

We should probably have all 4 of the following fields reported for each threshold statistic:

threshold
precision
recall
filter rate
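
As a rough sketch of the proposal above, the following reports all four fields for the threshold chosen by an optimization such as filter_rate_at_recall(min_recall=N). It is not the revscoring code: the function names mirror the statistic names only for readability, and the `scored` input format, the dictionary layout, and the sample data are assumptions.

```python
from typing import Dict, List, Optional, Tuple

Scored = List[Tuple[float, bool]]  # (model score, true label) pairs


def threshold_stats(scored: Scored, threshold: float) -> Dict[str, float]:
    """Compute the four proposed fields at a single threshold."""
    flagged = [label for score, label in scored if score >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(1 for _, label in scored if label)
    return {
        "threshold": threshold,
        "precision": true_pos / len(flagged) if flagged else 0.0,
        "recall": true_pos / total_pos if total_pos else 0.0,
        "filter_rate": 1 - len(flagged) / len(scored),
    }


def filter_rate_at_recall(
    scored: Scored, min_recall: float
) -> Optional[Dict[str, float]]:
    """Pick the threshold with the highest filter rate among those whose
    recall is at least min_recall, and return all four fields for it."""
    candidates = [
        threshold_stats(scored, t) for t in sorted({s for s, _ in scored})
    ]
    eligible = [c for c in candidates if c["recall"] >= min_recall]
    return max(eligible, key=lambda c: c["filter_rate"], default=None)


# Hypothetical sample data
SAMPLE: Scored = [
    (0.95, True), (0.9, True), (0.8, False),
    (0.7, True), (0.4, False), (0.2, False),
]

print(filter_rate_at_recall(SAMPLE, min_recall=0.9))
```

Returning the same record for every optimized statistic means a consumer can read the precision at the filter_rate_at_recall threshold without a separate precision_at_recall call.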

I've removed the recall asks for now because it doesn't look like we'll need them in the short term. That said, precision_at_recall() would still be quite helpful. Having all 4 fields reported for every statistic would also be tremendously helpful.

Catrope updated the task description. Apr 4 2017, 10:20 PM

Catrope mentioned this in rOEQ1abe5168f53b: Add precision stats for 60%, 75%, 99% and 99.5%. Apr 4 2017, 10:36 PM