3 categories labels system (multi-class classification)
Closed, Declined · Public

Assigned To

None

Authored By

	Alchimista
	Aug 7 2015, 10:18 AM

Description

As far as I can see, edits are currently classified as either vandalism or good faith. It would be interesting to add a third category, "testing", for edits by newbies that are not vandalism but are bad edits made in good faith. Bots using this category could send a less intimidating message, since many newcomers lose interest when warned, especially when the mistake was made in good faith. This classification would also be more accurate in theory, since it covers situations that fall in a gray area between vandalism and good faith, and in most cases heuristics could easily help improve labeling and scoring.

Related Objects

		Status	Subtype	Assigned	Task
		Declined		None	T108304 3 categories labels system (multi-class classification)
		Resolved		Halfak	T108679 Train/deploy "damaging" and "goodfaith" models from data collected through (finished) "edit quality" campaigns

Event Timeline

Alchimista created this task. Aug 7 2015, 10:18 AM

Alchimista raised the priority of this task from to Needs Triage.

Alchimista updated the task description.

Alchimista added a project: Machine-Learning-Team (Active Tasks).

Alchimista subscribed.

Restricted Application added a subscriber: Aklapper. Aug 7 2015, 10:18 AM

Alchimista mentioned this in T108305: Integrate revscoring and/or wikilabels into Huggle. Aug 7 2015, 10:21 AM

Actually, since each edit was labeled according to two orthogonal aspects, we already have four possible classes for each edit:

- A good faith edit, which unfortunately damages the page (inexperienced user trying to help)
- A good faith edit, which improves the page (or at least does not make it worse)
- A bad faith edit, which damages the page (aka vandalism)
- A bad faith edit, which improves the page (impossible?! non-vandal by accident? labeling error?)

See also the comments on https://en.wikipedia.org/wiki/Wikipedia_talk:Labels/Edit_quality#Feedback

Therefore, I think this feature request is already implemented: you just need to combine the two categories. Edits with a high probability of being goodfaith and a low probability of being damaging are the good contributions. Similarly, edits with a high probability of being goodfaith and a high probability of being damaging are the bad edits made in good faith.
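To make the combination concrete, here is a minimal sketch in Python. It assumes the two probabilities are already available as plain floats; the function name `classify_edit`, the class labels, and the 0.5 cutoff are all hypothetical choices for illustration, not part of revscoring or of any deployed model.

```python
def classify_edit(p_goodfaith: float, p_damaging: float,
                  threshold: float = 0.5) -> str:
    """Combine two independent scores into one of four classes.

    Hypothetical helper: the 0.5 threshold is an assumption and would
    need tuning per wiki and per use case.
    """
    goodfaith = p_goodfaith >= threshold
    damaging = p_damaging >= threshold

    if goodfaith and not damaging:
        return "good contribution"
    if goodfaith and damaging:
        # The "testing" category requested in the task description.
        return "good-faith mistake"
    if damaging:
        return "vandalism"
    # Bad faith but not damaging: possibly a labeling error.
    return "bad faith, not damaging"


print(classify_edit(0.9, 0.8))
```

A bot could then pick a gentler message template whenever the result is "good-faith mistake", and separate cutoffs for each score might work better than the single shared one used here.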

He7d3r renamed this task from 3 categorys labels system to 3 categories labels system (multi-class classification). Aug 7 2015, 4:01 PM

He7d3r set Security to None.

He7d3r added a subtask: T108679: Train/deploy "damaging" and "goodfaith" models from data collected through (finished) "edit quality" campaigns. Aug 11 2015, 12:18 PM

+1 I'll be training models of "good-faith" and "damaging" first. We can reconsider a 3 class model if that proves insufficient.

Halfak closed subtask T108679: Train/deploy "damaging" and "goodfaith" models from data collected through (finished) "edit quality" campaigns as Resolved. Nov 19 2015, 11:39 PM

Restricted Application added a subscriber: StudiesWorld. Nov 19 2015, 11:39 PM

Halfak edited projects, added Machine-Learning-Team; removed Machine-Learning-Team (Active Tasks). Mar 30 2016, 2:55 PM

Halfak added projects: revscoring, editquality-modeling. Mar 30 2016, 5:08 PM

Halfak moved this task from Unsorted to Ideas on the Machine-Learning-Team board.

It looks like the "goodfaith" and "damaging" models fit this use-case. Please re-open if that proves insufficient.
