Maniphest T199355

Investigate srwiki goodfaith model, why is it so bad?
Closed, Resolved · Public

Assigned To

Authored By

	awight
	Jul 11 2018, 6:18 PM

Description

In T197012, @Catrope discovered that our goodfaith model is unusable. Let's look at anomalies in the training data and try to solve the underlying issue, then train a new model.

Related Objects

Status	Assigned	Task
Open	None	T227094 Update RC Filters for new ORES capacities (July, 2019)
Resolved	SBisson	T225561 Update ORES thresholds for nlwiki
Open	None	T223273 Update srwiki thresholds for goodfaith model
Resolved	SBisson	T225562 Deploy ORES filters for zhwiki
Open	None	T225563 Deploy ORES filters for jawiki
Resolved	Halfak	T224484 ORES deployment: Early June
Resolved	• Catrope	T197012 Enable srwiki edit quality filters in RecentChanges
Resolved	Halfak	T199355 Investigate srwiki goodfaith model, why is it so bad?
Resolved	None	T220556 New labeling campaign for srwiki

Event Timeline

awight created this task. Jul 11 2018, 6:18 PM

Restricted Application removed a project: Patch-For-Review. Jul 11 2018, 6:18 PM

MMiller_WMF mentioned this in T197012: Enable srwiki edit quality filters in RecentChanges. Jul 11 2018, 6:26 PM

Support from me. One question: if we start training a new model, will ORES still be in RC?

In T199355#4417907, @Acamicamacaraca wrote:

One question: if we start training a new model, will ORES still be in RC?

Yes, a new labeling campaign won't affect anything that's already deployed.

Aca awarded a token. Jul 11 2018, 11:03 PM

MMiller_WMF unsubscribed. Jul 11 2018, 11:17 PM

SBisson removed • Catrope as the assignee of this task. Jul 12 2018, 1:30 PM

SBisson removed a project: Growth-Team (Sprint 0 (Growth Team)).

Harej moved this task from Unsorted to Ideas on the Machine-Learning-Team board. Jul 23 2018, 5:07 PM

Harej moved this task from Ideas to Research & analysis on the Machine-Learning-Team board.

Liuxinyu970226 added a project: Serbian-Sites. Jan 11 2019, 8:09 AM

awight unsubscribed. Mar 21 2019, 4:03 PM

Halfak edited projects, added Machine-Learning-Team (Research); removed Machine-Learning-Team. Apr 2 2019, 9:33 PM

Sorry this one was left hanging for a while. We'll be giving this priority and doing some exploration ASAP.

The first thing I want to try is to re-review the edits labeled "badfaith" in the campaign to see if there is something weird going on. I'll ping soon with some information about that.

Halfak triaged this task as High priority. Apr 9 2019, 9:28 PM

Harej edited projects, added Machine-Learning-Team (Active Tasks); removed Machine-Learning-Team (Research). Apr 9 2019, 9:29 PM

Harej edited projects, added Machine-Learning-Team; removed Machine-Learning-Team (Active Tasks).

Halfak moved this task from Research & analysis to Ready to go on the Machine-Learning-Team board. Apr 9 2019, 9:29 PM

I've dumped all of the edits labeled "badfaith" into this etherpad: https://etherpad.wikimedia.org/p/srwiki_badfaith_edits

There are ~450 of them -- which is quite a lot. We don't need to re-label them all, but maybe we can check what is there. @Acamicamacaraca, I can do a bit of work using translation utilities, but I'd appreciate it if you could look at 25-50 of these and just write a short description of what you're seeing and if you agree with the label. I'd really be interested in any help from other srwiki-pedians too :)

I just labeled a few. I'm seeing some edits that look like they are goodfaith in this set. I wonder if I am missing something.

I labeled some 30+ diffs.

I want to help. How can I save edits to the list on the etherpad?

An etherpad is directly editable. You should be able to just type into it.

So, as it stands, more than half of the items labeled badfaith are actually goodfaith upon review. I'll look into these labels to see if I can see some sort of consistency with them.

It looks like a lot of the edits that were labeled "badfaith" but that we have now re-labeled "goodfaith" were saved by @Zoranzoki21. That might be simply because Zoranzoki21 did a lot of labeling work. Would you take a look at them to see if you agree with our re-assessment? Maybe there is some confusion as to the meaning of "goodfaith".
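One way to separate "this labeler did a lot of work" from "this labeler's labels flip more often" is to compare flip rates rather than raw flip counts. The sketch below is only an illustration: the labeler names and records are hypothetical placeholders, not data from the srwiki campaign, and `flip_rates` is a made-up helper.

```python
from collections import Counter

# HYPOTHETICAL records, for illustration only (not the real srwiki labels):
# (labeler, original goodfaith label, goodfaith label after re-review)
records = [
    ("labeler_a", False, True),
    ("labeler_a", False, True),
    ("labeler_a", False, False),
    ("labeler_a", False, False),
    ("labeler_b", False, True),
    ("labeler_b", False, False),
]

def flip_rates(records):
    """Return, per labeler, the share of their labels that changed on re-review."""
    totals = Counter()
    flips = Counter()
    for labeler, original, relabeled in records:
        totals[labeler] += 1
        if original != relabeled:
            flips[labeler] += 1
    return {labeler: flips[labeler] / totals[labeler] for labeler in totals}

rates = flip_rates(records)
for labeler, rate in sorted(rates.items()):
    print(f"{labeler}: {rate:.0%} of labels changed on re-review")
```

In this made-up sample, labeler_a has twice as many flipped labels as labeler_b, but both have the same flip rate, which is the pattern you would expect if volume alone explained the raw counts.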

In T199355#5143550, @Halfak wrote:

An etherpad is directly editable. You should be able to just type into it.

Yes, thanks!

In T199355#5143703, @Halfak wrote:

It looks like a lot of the edits that were labeled "badfaith" but that we have now re-labeled "goodfaith" were saved by @Zoranzoki21. That might be simply because Zoranzoki21 did a lot of labeling work. Would you take a look at them to see if you agree with our re-assessment? Maybe there is some confusion as to the meaning of "goodfaith".

I checked diffs from lines 59 to 70. Will check others too.

OK, it's clear that we would benefit from re-labeling these 500 revisions using Wiki Labels. I'm working to get a campaign loaded. I'd like to call it something like "Edit quality (500 edits re-review)". Could someone help me get a Serbian translation of that?

In the meantime, I added the campaign here: https://labels.wmflabs.org/ui/srwiki/ Please pick up these edits and re-label them as we were doing in the etherpad. Once we're done with this, we can re-examine the data and update the training/testing set.

Halfak claimed this task. Apr 30 2019, 8:57 PM

Halfak edited projects, added Machine-Learning-Team (Active Tasks), Wikilabels, editquality-modeling; removed Machine-Learning-Team.

Restricted Application added a project: artificial-intelligence. Apr 30 2019, 8:57 PM

In T199355#5148701, @Halfak wrote:

OK, it's clear that we would benefit from re-labeling these 500 revisions using Wiki Labels. I'm working to get a campaign loaded. I'd like to call it something like "Edit quality (500 edits re-review)". Could someone help me get a Serbian translation of that?

@Halfak Thanks! I'm working on it now. The Serbian translation is: "Квалитет измена (поновни преглед 500 измена)"

257 labels left. I will finish this by the end of the day.

79 labels left @Halfak.

I had some problems at the end, but I talked with @Halfak on IRC and he resolved them, so I successfully completed all of the labels.

Just sat down with this again. Here's the old dataset:

edits	damaging	goodfaith
10	False	False
119212	False	True
447	True	False
225	True	True

And the new re-labeled dataset:

edits	damaging	goodfaith
0	False	False
119469	False	True
151	True	False
274	True	True

Just at a glance, this looks way more reasonable. In the original dataset, we had 10 edits labeled as not damaging but still "badfaith". Now those have disappeared, and we've gone from 447 damaging "badfaith" edits (457 "badfaith" in total) to 151.
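For anyone who wants to check the arithmetic, here is a minimal sketch that tallies the two tables above. The counts are copied from the tables; the names `old`, `new`, `badfaith_total`, and `summarize` are just illustrative. Note that counting every row with goodfaith = False gives 457 badfaith edits in the old dataset (447 damaging plus the 10 non-damaging ones), and that both datasets sum to the same number of edits.

```python
# Counts copied from the two tables above, keyed by (damaging, goodfaith).
old = {(False, False): 10, (False, True): 119212, (True, False): 447, (True, True): 225}
new = {(False, False): 0, (False, True): 119469, (True, False): 151, (True, True): 274}

def badfaith_total(counts):
    """Sum every cell where goodfaith is False, regardless of the damaging label."""
    return sum(n for (_damaging, goodfaith), n in counts.items() if not goodfaith)

def summarize(name, counts):
    """Print the badfaith count and its share of all edits in the dataset."""
    total = sum(counts.values())
    bad = badfaith_total(counts)
    print(f"{name}: {bad} badfaith of {total} edits ({bad / total:.3%})")

summarize("old", old)
summarize("new", new)
```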

I'm re-training the models now. I'll report back tomorrow on the fitness we get.

Diffusion mentioned this in rOEQe0bad9dc6680: Updates srwiki model with new goodfaith data. May 14 2019, 12:54 PM

Huge boost in model fitness! This is now one of the best "goodfaith" models that we have! I've submitted my work for review. See https://github.com/wikimedia/editquality/pull/195 Will update about deployments of the new model when that is ready.

Halfak moved this task from Parked to Review on the Machine-Learning-Team (Active Tasks) board. May 14 2019, 1:26 PM

Halfak mentioned this in T223273: Update srwiki thresholds for goodfaith model. May 14 2019, 1:29 PM

Halfak added a parent task: T223273: Update srwiki thresholds for goodfaith model.

Wow. From mud to gold :)

In T199355#5193386, @Acamicamacaraca wrote:

Wow. From mud to gold :)

This rhymes in Serbian :D

Aca closed subtask T220556: New labeling campaign for srwiki as Resolved. May 18 2019, 2:42 PM

What about the damaging model? I forgot to ask whether it is good. We have already implemented it on sr.wiki.

We got a minor improvement for the "damaging" model too, but the change is really too small to be meaningful.

FYI, I'm still waiting on review for this change. My team is a bit understaffed at the moment, so I need to rely on external reviewers. Sorry for the delay!

• Petar.petkovic mentioned this in T224484: ORES deployment: Early June. May 28 2019, 9:33 PM

Diffusion mentioned this in rOEQ4e282dd12fca: Updates srwiki model with new goodfaith data. May 30 2019, 4:03 PM

Halfak moved this task from Review to Completed on the Machine-Learning-Team (Active Tasks) board. Jun 3 2019, 3:57 PM

Can I notify the community about this? I saw that you have created a patch on Gerrit.

It looks like we're going to get this deployed next week. I'm aiming for Monday, June 17th.

We've been blocked for a while on a few issues, e.g. an issue with our source code control/deployment system (T224996), and now we're blocked on deployment while the "Site Reliability Engineering" team has an offsite this week.

If you'd like to make an announcement, I think that is a great idea. Let me know how I can help.

Halfak added a parent task: T224484: ORES deployment: Early June. Jun 11 2019, 7:36 PM

Halfak moved this task from Completed to Pending deployment on the Machine-Learning-Team (Active Tasks) board. Jun 17 2019, 4:06 PM

I informed the community. I hope you can deploy this by next week, since you have had some issues. Best regards!

Actually, we just deployed from our side on Monday. We're now waiting on the Growth-Team to enable the filters in RecentChanges. But if you use Huggle or RTRC, you should be able to see ORES predictions right away.

Halfak moved this task from Pending deployment to Completed on the Machine-Learning-Team (Active Tasks) board. Jun 18 2019, 1:38 PM

Halfak closed this task as Resolved. Jun 18 2019, 1:39 PM

FriedrickMILBarbarossa subscribed. Feb 8 2020, 5:36 AM

Hey! User intent filters are not yet displayed in Recent Changes (screenshot). Since Halfak said we're waiting for the Growth team now, @Trizek-WMF do you know anything about this and when it should be deployed?

In T199355#6213087, @Acamicamacaraca wrote:

Hey! User intent filters are not yet displayed in Recent Changes (screenshot). Since Halfak said we're waiting for the Growth team now, @Trizek-WMF do you know anything about this and when it should be deployed?

Is there a dedicated task about implementing them?

In T199355#6232600, @Trizek-WMF wrote:

In T199355#6213087, @Acamicamacaraca wrote:

Hey! User intent filters are not yet displayed in Recent Changes (screenshot). Since Halfak said we're waiting for the Growth team now, @Trizek-WMF do you know anything about this and when it should be deployed?

Is there a dedicated task about implementing them?

@Trizek-WMF What about T223273?

Kizule moved this task from Backlog to Closed on the Serbian-Sites board. Feb 27 2023, 1:37 PM
