
Enable srwiki edit quality filters in RecentChanges
Closed, Resolved (Public)

Description

The srwiki advanced edit quality models are deployed and ready for RC integration.

Event Timeline

Halfak added subscribers: Aca, Halfak.

The model is now deployed. @Acamicamacaraca has been asking (at T174687) when we can get the filters enabled.

Halfak renamed this task from Enable srwiki edit quality features to Enable srwiki edit quality filters in RecentChanges. Jun 12 2018, 8:46 PM
Halfak unsubscribed.

So, this is the final task. I hope that the Collaboration Team will enable the filters soon.

Vvjjkkii renamed this task from Enable srwiki edit quality filters in RecentChanges to r6aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii removed Catrope as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from r6aaaaaaaa to Enable srwiki edit quality filters in RecentChanges. Jul 2 2018, 2:04 PM
CommunityTechBot assigned this task to Catrope.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description. (Show Details)
CommunityTechBot added a subscriber: Aklapper.
In T197012#4319768, @Acamicamacaraca wrote:

@Catrope How's it going? xD

Sorry for the delay, this caught me right in the middle of a busy time.

I looked at the properties of the srwiki model, and while the damaging model is usable, the goodfaith model is not. The highest precision for bad faith that this model can achieve is 23.1% (see queries for >=0.23 and >=0.24). That means we could implement a "may be bad faith" filter (which would have 16.8% precision at 62.5% recall), but not a "likely bad faith" or "very likely bad faith" filter, because we want those to have a precision of at least 45% and 60% respectively, and ideally 60% and 90%.
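To make the reasoning above concrete, here is a minimal Python sketch of the feasibility check. It is an illustration only, not the tooling actually used: the function name `feasible_filters` and the dictionary layout are hypothetical, and the only numbers in it are the precision floors and the 23.1% ceiling quoted in this comment.

```python
# Minimum acceptable precision per bad-faith filter, as quoted in the comment.
# The names and structure here are hypothetical, for illustration only.
MIN_PRECISION = {
    "likely bad faith": 0.45,
    "very likely bad faith": 0.60,
}


def feasible_filters(max_precision, min_precision=MIN_PRECISION):
    """Return the filters whose precision floor the model can reach at all."""
    return [
        name
        for name, floor in min_precision.items()
        if max_precision >= floor
    ]


# The srwiki goodfaith model tops out at 23.1% precision for bad faith,
# so neither of the stricter filters is feasible.
print(feasible_filters(0.231))  # -> []
```

Because the model's best attainable precision sits below both floors, no choice of score threshold can rescue the two stricter filters; only more or better labeled data could.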

The damaging model is adequate though; it's not the best model we have, but it's workable. We could configure the following filters:

  • Very likely good: 99.5% precision at 100% (?!) recall, or alternatively 100% precision at 90.7% recall
  • May be bad: 15.5% precision at 90.1% recall (we aim for 90% recall or 15% precision, so this fits that perfectly)
  • Likely bad: 45.7% precision at 39.9% recall (normally we aim for 60% precision, but 45% is fine for lower-fit models)
  • Very likely bad: 75% precision at 17.5% recall (normally we aim for 90% precision, but that would lead to 5.7% recall which I think is too low)
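The trade-off behind the list above can be sketched as picking, for each filter, the operating point with the highest recall that still meets a precision floor. The Python below is a rough sketch under that assumption: `best_recall_at` is a hypothetical helper, and the (precision, recall) pairs are just the four damaging-model figures quoted in this comment, not a full threshold table from ORES.

```python
# (precision, recall) operating points for "bad" edits, taken from the
# figures quoted in this comment. A real threshold table would have many more.
POINTS = [
    (0.155, 0.901),
    (0.457, 0.399),
    (0.75, 0.175),
    (0.90, 0.057),
]


def best_recall_at(points, min_precision):
    """Pick the point with the highest recall whose precision meets the floor.

    Returns None when no point is precise enough.
    """
    candidates = [p for p in points if p[0] >= min_precision]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p[1])


# Relaxing the "very likely bad" floor from 90% to 60% precision
# triples the recall (5.7% -> 17.5%).
print(best_recall_at(POINTS, 0.90))  # -> (0.9, 0.057)
print(best_recall_at(POINTS, 0.60))  # -> (0.75, 0.175)
```

This is why the "very likely bad" filter settles on 75% precision: insisting on the usual 90% would leave it catching too few damaging edits to be useful.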

@awight The last time we ran into this situation, on T192498: Deploy ORES advanced editquality models to arwiki, I ended up deploying only the damaging model but not the goodfaith model, and you said that we should consider not deploying (or undeploying) low quality models. How do you feel about this case?

Change 444018 had a related patch set uploaded (by Catrope; owner: Catrope):
[operations/mediawiki-config@master] Enable ORES edit quality filters on srwiki (damaging only)

https://gerrit.wikimedia.org/r/444018

If there are no objections, I'm going to deploy this on Monday July 9th at 18:00-19:00 UTC. (cc @Acamicamacaraca )

@Catrope What do you think about restarting the edit review? Maybe we can get better-quality filters on a second try.

Change 444018 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable ORES edit quality filters on srwiki (damaging only)

https://gerrit.wikimedia.org/r/444018

Mentioned in SAL (#wikimedia-operations) [2018-07-09T19:01:20Z] <catrope@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Enable ORES damaging filter on srwiki (T197012) (duration: 00m 50s)

Thank you very much! Works correctly for now!

In T197012#4407757, @Acamicamacaraca wrote:

@Catrope What do you think about restarting the edit review? Maybe we can get better-quality filters on a second try.

That's a question for @awight and @Halfak , they're the ORES experts. I'm just the Recent Changes guy :)

We can definitely do a second round of labels. I'd like to have @notconfusing look for any anomalies in the labeled data though so we can see if there are inconsistencies. I would expect that the goodfaith classifier would work better!

MMiller_WMF subscribed.

I just wanted to post on this task to clarify its status. Given that the Growth team has enabled the filters, and the remainder of the conversation is about improvements to the models, I'm going to resolve this ticket. @Halfak, is there a separate task where it would be good to have that conversation?

@awight created T199355 to look into the model itself. Thanks!