Finish editquality labeling campaign for fiwiki
Closed, Resolved, Public

Description

The current models are trained with data from 2016. We should run a new labeling campaign so we can boost the number of observations and the recency of the information.

@Zache helped us use flaggedrevs to try to boost the performance of the ORES model. There are reports that the model statistics don't look as good since we started using the flaggedrevs data. @Zache, what has your experience been?

Either way, I think training with new data is a good idea.

Event Timeline

The damaging model is better than it was in summer 2017 (before the flagged revs data) and it is useful. This is mostly because it gives more stable results (i.e. bad edits get a higher score than good ones, and the scores are predictable). However, it is weighted so that almost any value >0.1 means a bad edit, which I suppose is not on purpose. Nonetheless, it is better now.

The goodfaith model is broken. At some point the ORES goodfaith averages over the last 25 edits for a user were either 1 or 0, with nothing in between. Now there are other values too, but goodfaith still happily gives 0.999 for clear vandalism. (See the quarry below for examples.) Before the flagged revs data, the goodfaith model was substantially better than the damaging model.
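For anyone who wants to check this kind of symptom on their own data, here is a minimal sketch of the two checks described above: the average over a user's last 25 scores, and how many scores sit right at 0 or 1. It assumes the scores have already been fetched as a plain list of floats ordered oldest to newest. The function names, the `eps` tolerance, and the sample values are all hypothetical and are not taken from ORES or from the quarry.

```python
from statistics import mean


def last_n_mean(scores, n=25):
    """Mean of the most recent n scores (scores ordered oldest to newest)."""
    recent = scores[-n:]
    if not recent:
        raise ValueError("no scores given")
    return mean(recent)


def saturated_fraction(scores, eps=0.01):
    """Fraction of scores lying within eps of 0 or of 1."""
    if not scores:
        raise ValueError("no scores given")
    extreme = sum(1 for s in scores if s <= eps or s >= 1 - eps)
    return extreme / len(scores)


# Made-up illustrative values, NOT real ORES output.
example_scores = [0.999, 0.001, 0.998, 1.0, 0.0, 0.52]

if __name__ == "__main__":
    print("mean of last 25:", last_n_mean(example_scores))
    print("fraction near 0 or 1:", saturated_fraction(example_scores))
```

A saturated fraction close to 1 across many users would match the "1 or 0 and nothing in between" behaviour, but it does not say which edits are scored wrongly; that still needs manual review of examples.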

@Halfak btw, there is already a second campaign slowly under way, which we could try to finalize if it meets the requirements: https://labels.wmflabs.org/stats/fiwiki/

Oh yes! I remember setting that up. Finishing that up would be great. It looks like we're pretty close.

Halfak renamed this task from Create new editquality labeling campaign for fiwiki to Finish editquality labeling campaign for fiwiki. Feb 15 2019, 7:53 PM
Halfak triaged this task as Medium priority. Feb 15 2019, 8:00 PM

T166909 Complete edit quality campaign v2 in Finnish Wikipedia is completed.