Currently, the process is:
- Randomly sample ~20k revisions
- Filter out edits by trusted users to get down to ~5k revisions
- Users label the edits
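The sampling and trusted-user filtering above could be sketched roughly as follows. This is a hypothetical illustration: the revision dicts, the `user` field, and the function name are assumptions, not the actual editquality code.

```python
import random

def sample_for_labeling(revisions, trusted_users, sample_size=20000):
    """Randomly sample revisions, then drop edits made by trusted users.

    Edits by trusted users are assumed good, so they don't need a human
    label; removing them shrinks the set sent to labelers.
    (Hypothetical sketch; field names are assumptions.)
    """
    sampled = random.sample(revisions, min(sample_size, len(revisions)))
    return [rev for rev in sampled if rev["user"] not in trusted_users]
```

In practice the trusted-user list would come from user groups and edit counts, and the sampling would be done against the wiki's revision table rather than an in-memory list.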
When doing a second campaign, we should use the model trained on the first campaign's data to filter the edits down to an even smaller set:
- Randomly sample ~20k revisions
- Filter out edits by trusted users to get down to ~5k revisions
- Use the old model to filter out edits that it confidently scores as not damaging and goodfaith (down to ~1 or 2k?)
- Users label the edits
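The model-based filtering step might look something like the sketch below. The scorer callables, thresholds, and function name are all assumptions for illustration; the real version would call the trained damaging and goodfaith models, and the thresholds would need tuning so that almost no damaging edits are filtered away.

```python
def filter_with_model(revisions, damaging_score, goodfaith_score,
                      damaging_threshold=0.1, goodfaith_threshold=0.9):
    """Drop edits the previous campaign's model is confident are clean.

    An edit is removed only when the model scores it as very unlikely
    to be damaging AND very likely to be goodfaith; everything else is
    kept for human labeling. (Hypothetical sketch; thresholds and
    scorer signatures are assumptions.)
    """
    kept = []
    for rev in revisions:
        p_damaging = damaging_score(rev)
        p_goodfaith = goodfaith_score(rev)
        if p_damaging < damaging_threshold and p_goodfaith > goodfaith_threshold:
            continue  # obviously fine edit; no human label needed
        kept.append(rev)
    return kept
```

The point of this step is that labeler effort is then spent almost entirely on the ambiguous or likely-bad edits, which are the ones the next model most needs labels for.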
This will probably belong in editquality autolabel.