
Update ORES filter thresholds for huwiki
Open, Needs Triage · Public

Description

We have deployed an improved version of the models. The thresholds might need a minor update.

Event Timeline

Halfak created this task. · Aug 7 2019, 3:39 PM
Restricted Application added a project: artificial-intelligence. · Aug 7 2019, 3:39 PM
Restricted Application added a subscriber: Aklapper.
kostajh added subscribers: SBisson, kostajh.

@SBisson would you have time to take on this one as well?

SBisson added a subscriber: Tgr. · Aug 16 2019, 3:23 PM

> @SBisson would you have time to take on this one as well?

I could but @Tgr has expressed a special interest in it so I'm happy to step back and support.

Change 536732 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[operations/mediawiki-config@master] Update ORES filter threshold configuration for new huwiki model

https://gerrit.wikimedia.org/r/536732

Tgr added a comment. · Sat, Sep 14, 11:06 AM

One thing I noticed while playing around with the data is that very few anon edits match damaging/likelygood: single digits per month, while total anon edits tend to be between 4K and 10K. For registered editors the match rate is high, in the 80-90% range. Does that mean the filter threshold is poorly chosen, that the model is still biased against anons, or does this simply reflect the fact that anons are harder to trust? (It probably doesn't reflect edit quality: manual checks usually find that between a quarter and a third of anon edits are problematic.)

Tgr added a comment. · Sun, Sep 15, 8:15 PM

Also, goodfaith/likelybad and goodfaith/verylikelybad are barely different for anons (see graph here showing the fraction of edits these match monthly). They are fairly different for non-anonymous users, but then there are (as one would expect) about 100x more matching anon edits. Could this be a threshold problem, a bias problem, or is it completely normal?
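
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the per-group match rates could be computed. It is not the script used for the numbers above. The threshold values, the `FILTERS` table, the `match_rates` helper and the shape of the edit records are all assumptions made for illustration; the real thresholds are the ones in the config patch. Each filter is treated as "score at or below a maximum", which is how a low damaging probability maps to likelygood and a low goodfaith probability maps to likelybad.

```python
from collections import defaultdict

# Hypothetical thresholds, for illustration only.
# Each filter is (model name, maximum score that still matches).
FILTERS = {
    "damaging/likelygood": ("damaging", 0.1),
    "goodfaith/likelybad": ("goodfaith", 0.4),
    "goodfaith/verylikelybad": ("goodfaith", 0.2),
}


def match_rates(edits, filters=FILTERS):
    """Return {group: {filter name: fraction of that group's edits matching}}.

    `edits` is an iterable of dicts with an "anon" flag and one score per
    model (assumed here to be the probability of the "true" class).
    """
    totals = defaultdict(int)
    matches = defaultdict(lambda: defaultdict(int))
    for edit in edits:
        group = "anon" if edit["anon"] else "registered"
        totals[group] += 1
        for name, (model, max_score) in filters.items():
            if edit[model] <= max_score:
                matches[group][name] += 1
    return {
        group: {name: matches[group][name] / totals[group] for name in filters}
        for group in totals
    }
```

Running this once per month of scored edits gives the fractions plotted in the graph. If goodfaith/likelybad and goodfaith/verylikelybad come out nearly equal for anons, it means almost all anon scores that fall below the looser threshold also fall below the stricter one.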