The red in my watchlist screams "Red alert! Big problem! High likelihood of this being terrible!" whereas my experience so far would be better matched by a warmer colour (orange, or something), where the change needs attention but isn't screaming at me.
We should set 3 thresholds for color:
- filter_rate_at_recall(min_recall=0.9): yellow (review for completeness)
- filter_rate_at_recall(min_recall=0.75): orange (likely to be damaging)
- recall_at_fpr(max_fpr=0.1): red (almost certainly damaging)
In the case of English Wikipedia's damaging model, this would set the thresholds to (20%, 46%, 94%).
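To make the proposal concrete, here is a rough sketch of how a score could be mapped to a colour. The cutoffs are the English Wikipedia numbers quoted above. The names `THRESHOLDS` and `color_for` are made up for illustration, and treating each cutoff as inclusive (`>=`) is my assumption. In practice the cutoffs would be looked up per wiki rather than hard-coded.

```python
# Hypothetical cutoffs, copied from the English Wikipedia figures above.
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [
    (0.94, "red"),     # almost certainly damaging
    (0.46, "orange"),  # likely to be damaging
    (0.20, "yellow"),  # review for completeness
]


def color_for(damaging_probability):
    """Return the highlight colour for a damaging score, or None for no highlight."""
    for minimum, color in THRESHOLDS:
        # Assumption: a score exactly at the cutoff gets that colour.
        if damaging_probability >= minimum:
            return color
    return None


if __name__ == "__main__":
    for score in (0.10, 0.30, 0.50, 0.97):
        print(score, color_for(score))
```

Checking the most severe cutoff first means an edit only ever gets one colour, which is the behaviour I'd expect in the watchlist.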
It would be great if we also had some sort of tooltip that showed the exact prediction probabilities, like the ScoredRevisions tool does, e.g. "85% damaging, 23% goodfaith".
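The tooltip text itself could be built along these lines. This is only a sketch: `tooltip_text` is a hypothetical helper, and I'm assuming the scores arrive as a plain mapping of model name to probability, which may not match how the real data is shaped.

```python
def tooltip_text(probabilities):
    """Format a {model name: probability} mapping as tooltip text.

    Assumption: probabilities are floats between 0 and 1, and the
    mapping's order is the order they should be shown in.
    """
    parts = [f"{round(p * 100)}% {name}" for name, p in probabilities.items()]
    return ", ".join(parts)


if __name__ == "__main__":
    print(tooltip_text({"damaging": 0.85, "goodfaith": 0.23}))
```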