
System to get a better sample set for edit quality campaigns
Closed, Resolved, Public

Description

When we gathered 20,000 revisions for fawiki, we ended up with too few uncertain revisions, because bots and good users account for a larger share of the edits on this wiki.

  1. Draw a sample of roughly 50,000 random revisions.
  2. Auto-label the 50,000-revision sample, so that we are left with 50,000 - x auto-labeled revisions and x revisions for wikilabels.
  3. Randomly resample 18,000 auto-labeled revisions and 2,000 revisions for wikilabels.
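
The three steps above could be sketched roughly as follows. This is only an illustration: `build_campaign_sample` and the `is_auto_labelable` predicate are hypothetical names, and the task does not say which rule decides that a revision can be auto-labeled, so the caller has to supply it.

```python
import random


def build_campaign_sample(pool, is_auto_labelable, n_initial=50_000,
                          n_auto=18_000, n_manual=2_000, seed=None):
    """Return (auto_labeled, for_wikilabels) lists of revisions.

    `pool` is a list of revisions; `is_auto_labelable` is a caller-supplied
    predicate (an assumption, the real auto-labeling rule is not given here).
    """
    rng = random.Random(seed)

    # Step 1: draw the initial random sample.
    initial = rng.sample(pool, min(n_initial, len(pool)))

    # Step 2: split into auto-labeled revisions and the remaining x revisions.
    auto, manual = [], []
    for rev in initial:
        (auto if is_auto_labelable(rev) else manual).append(rev)

    # Step 3: randomly resample each group down to its target size.
    auto_sample = rng.sample(auto, min(n_auto, len(auto)))
    manual_sample = rng.sample(manual, min(n_manual, len(manual)))
    return auto_sample, manual_sample
```

If x turns out to be smaller than 2,000, this sketch simply returns all x revisions for wikilabels; a larger initial sample would be needed to reach the target.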

I want this to be handled independently of Quarry because I'd rather we no longer micromanage this. We are getting more languages. It should, however, have some sort of config file so that we can fine-tune it as needed for other uses.
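
One possible shape for such a config, sketched in Python with JSON as the file format. Everything here is an assumption for illustration: the key names, the `SamplingConfig` class, and the per-wiki override of 100,000 are hypothetical; only the defaults of 50,000, 18,000 and 2,000 come from the steps above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingConfig:
    # Defaults taken from the task description; field names are hypothetical.
    initial_sample: int = 50_000
    auto_labeled: int = 18_000
    for_wikilabels: int = 2_000


def load_configs(text):
    """Parse a JSON object mapping wiki name -> overrides of the defaults."""
    return {wiki: SamplingConfig(**overrides)
            for wiki, overrides in json.loads(text).items()}


# Illustrative config: fawiki gets a larger initial sample, enwiki uses defaults.
EXAMPLE = '{"fawiki": {"initial_sample": 100000}, "enwiki": {}}'
configs = load_configs(EXAMPLE)
```

Keeping the per-wiki values as overrides of shared defaults means a new language only needs an entry when it deviates from them.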

Event Timeline

ToAruShiroiNeko claimed this task.
ToAruShiroiNeko raised the priority of this task from to Low.
ToAruShiroiNeko updated the task description.
ToAruShiroiNeko subscribed.
Halfak subscribed.

We now do this by ensuring we get at least 2.5k revisions that "need review" and matching them with 2.5k revisions that don't.
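
A minimal sketch of that kind of balanced sampling, assuming a caller-supplied `needs_review` predicate. The function name and the predicate are hypothetical, and the code actually in use may work differently.

```python
import random


def balanced_sample(revisions, needs_review, n_per_class=2500, seed=None):
    """Return (review, no_review), each holding n_per_class revisions."""
    rng = random.Random(seed)

    # Partition the candidate revisions by the (assumed) review predicate.
    review, no_review = [], []
    for rev in revisions:
        (review if needs_review(rev) else no_review).append(rev)

    # Fail loudly when either class is too small, rather than silently
    # returning an unbalanced set.
    if len(review) < n_per_class or len(no_review) < n_per_class:
        raise ValueError("not enough revisions in one of the classes")

    return rng.sample(review, n_per_class), rng.sample(no_review, n_per_class)
```

When the error is raised, the caller would gather more candidate revisions and try again.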