Currently, the process is:
- Randomly sample ~20k revisions
- Filter out edits by trusted users to get down to ~5k revisions
- Users label the edits
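The sampling and trusted-user filtering above could be sketched roughly as follows. This is a hypothetical illustration: the revision dicts, the `user` field, and the function name are assumptions, not the actual editquality code.

```python
import random

def sample_for_labeling(revisions, trusted_users, sample_size=20000):
    """Randomly sample revisions, then drop edits made by trusted users.

    Edits by trusted users are assumed good, so they don't need a human
    label; removing them shrinks the set sent to labelers.
    (Hypothetical sketch; field names are assumptions.)
    """
    sampled = random.sample(revisions, min(sample_size, len(revisions)))
    return [rev for rev in sampled if rev["user"] not in trusted_users]
```

In practice the trusted-user list would come from user groups and edit counts, and the sampling would be done against the wiki's revision table rather than an in-memory list.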
When doing a second campaign, we should use the model trained on the first campaign's data to filter the edits down to an even smaller set:
- Randomly sample ~20k revisions
- Filter out edits by trusted users to get down to ~5k revisions
- Use the old model to filter out edits that it confidently scores as not damaging and goodfaith (down to ~1 or 2k?)
- Users label the edits
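The model-based filtering step might look something like the sketch below. The scorer callables, thresholds, and function name are all assumptions for illustration; the real version would call the trained damaging and goodfaith models, and the thresholds would need tuning so that almost no damaging edits are filtered away.

```python
def filter_with_model(revisions, damaging_score, goodfaith_score,
                      damaging_threshold=0.1, goodfaith_threshold=0.9):
    """Drop edits the previous campaign's model is confident are clean.

    An edit is removed only when the model scores it as very unlikely
    to be damaging AND very likely to be goodfaith; everything else is
    kept for human labeling. (Hypothetical sketch; thresholds and
    scorer signatures are assumptions.)
    """
    kept = []
    for rev in revisions:
        p_damaging = damaging_score(rev)
        p_goodfaith = goodfaith_score(rev)
        if p_damaging < damaging_threshold and p_goodfaith > goodfaith_threshold:
            continue  # obviously fine edit; no human label needed
        kept.append(rev)
    return kept
```

The point of this step is that labeler effort is then spent almost entirely on the ambiguous or likely-bad edits, which are the ones the next model most needs labels for.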
This will probably belong in editquality autolabel.