
Deploy Wikidata item quality campaign
Closed, Resolved (Public)

Description

  • discuss sampling strategy
  • generate sample
  • create campaign at labels.wmflabs.org

Event Timeline

Halfak renamed this task from Deploy Item Labeling Campaign to Deploy Wikidata item quality campaign. Feb 7 2017, 9:41 PM
Halfak removed Ladsgroup as the assignee of this task.
Halfak updated the task description.

I think we'll want a stratified sample for the labeling campaign so that we don't end up with 99% of items in the lowest-quality stratum. We can probably stratify using some basic heuristics, but which ones? Number of statements? We can probably skip labeling the showcase items, since they have already been reviewed. How many showcase items are there?
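To make the idea concrete, here is a minimal sketch of stratifying by number of statements. It is only an illustration, not the sampling code used for this campaign: the bin boundaries, the function names, and the `exclude` parameter (for skipping showcase items) are all assumptions.

```python
import random
from collections import defaultdict


def stratum_for(statement_count, boundaries=(1, 5, 20, 50)):
    """Return the stratum index for an item with this many statements.

    The boundaries are hypothetical cut points chosen for illustration.
    """
    for index, upper in enumerate(boundaries):
        if statement_count < upper:
            return index
    return len(boundaries)


def stratified_sample(items, per_stratum, exclude=frozenset(), seed=0):
    """Sample up to `per_stratum` item IDs from each stratum.

    `items` is an iterable of (item_id, statement_count) pairs.
    `exclude` holds item IDs to skip, e.g. already-reviewed showcase items.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item_id, statement_count in items:
        if item_id in exclude:
            continue
        strata[stratum_for(statement_count)].append(item_id)

    sample = {}
    for index in sorted(strata):
        members = strata[index]
        # Small strata contribute everything they have.
        k = min(per_stratum, len(members))
        sample[index] = rng.sample(members, k)
    return sample
```

Drawing the same number of items from each stratum keeps the many near-empty items from crowding out the rarer, richer ones, which is the concern raised above.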

halfak@wikilabels-01:~/datasets$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign wikidatawiki "Item quality (5k stratified)" item_quality PrintablePageAsOfRevision 1 10 --config /srv/wikilabels/config/config/
{'active': True, 'labels_per_task': 1, 'tasks_per_assignment': 10, 'view': 'PrintablePageAsOfRevision', 'name': 'Item quality (5k stratified)', 'form': 'item_quality', 'id': 51, 'created': datetime.datetime(2017, 4, 8, 18, 20, 48, 468435), 'wiki': 'wikidatawiki'}
halfak@wikilabels-01:~/datasets$ cat wikidatawiki.stratified_revisions.5k_sample.json | sudo -u www-data /srv/wikilabels/venv/bin/wikilabels task_inserts --config /srv/wikilabels/config/config/ 51

@Glorian_WD updated the sampling strategy so I re-deployed. See http://labels.wmflabs.org/campaigns/wikidatawiki/52/?campaign=stats

halfak@wikilabels-01:~$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign wikidatawiki "Item quality (5k sample)" item_quality PrintablePageAsOfRevision 1 10 --config /srv/wikilabels/config/config/
{'view': 'PrintablePageAsOfRevision', 'wiki': 'wikidatawiki', 'form': 'item_quality', 'labels_per_task': 1, 'created': datetime.datetime(2017, 4, 10, 18, 19, 0, 152014), 'active': True, 'id': 52, 'tasks_per_assignment': 10, 'name': 'Item quality (5k sample)'}
halfak@wikilabels-01:~$ cat datasets/wikidatawiki.stratified_revisions.5k_sample.json | sudo -u www-data /srv/wikilabels/venv/bin/wikilabels task_inserts --config /srv/wikilabels/config/config/ 52

@Halfak: could you briefly explain the bug?