
Re-label huwiki damaging and badfaith edits
Closed, Resolved · Public


It seems to be a common theme these days. We should probably have a good process for this. But for now, let's just give it a try.

Let's create a campaign for labeling the target edits of huwiki again.

It would be great if we could have a way to make sure that the second label comes from someone other than the first labeler. But that might need to be future work.
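A minimal sketch of that constraint, not how Wikilabels assigns tasks: it assumes we already have a mapping from task ID to the user who gave the first label. The function name and the data shape are hypothetical.

```python
def eligible_tasks(first_labeler_by_task, user):
    """Return the task IDs that `user` may re-label.

    Assumption: `first_labeler_by_task` maps each task ID to the name of
    the user who provided the first label. A task is eligible only when
    that first labeler is someone other than `user`.
    """
    return [
        task_id
        for task_id, first_labeler in first_labeler_by_task.items()
        if first_labeler != user
    ]
```

For example, a user who first-labeled tasks 1 and 3 would only be offered task 2 out of tasks 1 to 3.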

Event Timeline

Restricted Application added a subscriber: Aklapper.

As long as the size of the edit set is significantly smaller than the original one (hundreds), we can just ask the people who have reviewed more than, say, 100 edits to not participate this time, to make this training set mostly independent.
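To find who to ask, something like the following could work. It is only a sketch: it assumes the labels of the original campaign are available as (user, rev_id) pairs, which is a hypothetical export format, and it uses the threshold of 100 mentioned above.

```python
from collections import Counter


def heavy_labelers(labels, threshold=100):
    """Return the set of users who labeled more than `threshold` edits.

    Assumption: `labels` is an iterable of (user, rev_id) pairs, one per
    label submitted in the original campaign.
    """
    counts = Counter(user for user, _rev_id in labels)
    return {user for user, n in counts.items() if n > threshold}
```

Users in the returned set are the ones who would be asked to sit out the re-labeling campaign.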

Harej triaged this task as Medium priority. Jun 4 2019, 9:14 PM
Harej moved this task from Unsorted to New development on the Machine-Learning-Team board.

@Halfak can I help move this along? If you can specify the conditions and the expected format, I can produce a dataset.

I created a new campaign with 500 items marked damaging or badfaith. See

I named the campaign "Edit quality re-label (500)". I can update that with a translated string if you can give me the text.

Thanks @Halfak! "Edit quality re-label" is "Szerkesztési minőség újraosztályozása".

Since we want to avoid the same people dominating the labeling, we depend on T223899: Information about finished campaigns should be accessible in Wikilabels or some other way of getting a list of top labelers.

OK the name is updated. We have a change that should allow the finished campaigns to be visible. I'll work on a deployment.

New deployment is out. You can see stats on the old campaign here:

@Halfak something seems to be wrong with Wikilabels. After creating a new workset, the request for workset contents (<user>/<workset>/?tasks=&campaign=) returns tasks: [], so Wikilabels shows a 0/0 progress bar, throws "Could not select task. Index 0 out of bounds. at TaskList.selectByIndex" on the JS console, and displays "Nem sikerült betölteni a(z) „damaging_and_goodfaith” űrlapot: $2" (in English: "Failed to load the 'damaging_and_goodfaith' form: $2") to the user. Several other users have reproduced this, but not everyone who tried. Should I file a separate bug report?

@Halfak labeling is now mostly done (474 out of 500). All the remaining items need privileged access, and none of the admins are able to use the labeling tool because of the aforementioned bug (which seems to affect only a subset of users).

OK that's great. We can work with this.

Is there any new information or update?

We should have the updated model deployed early this week. Sorry for the delay! We had a couple of minor hiccups last week with this deployment because there are a few other things going out with it.