Edit quality campaign for Albanian Wikipedia
Closed, Resolved (Public)

Description

Hi @Halfak, I'm not sure about the workflow for this. Can you guide me?

Thanks!

  • Confirm translations are ready
  • List of trusted user groups
  • Translate "Edit quality (20k sample)"
  • Run prelabeling script
  • Load revisions into labels.wmflabs.org

Event Timeline

Liuxinyu970226 renamed this task from Edit quality campaign for SQ to Edit quality campaign for Albanian Wikipedia. Apr 2 2017, 2:24 PM

@Liuxinyu970226, it looks like we don't have translations for sq yet in the labeling interface, so users will just see English.

If you want to get things translated, check out https://translatewiki.net. Specifically, we'll need translations for the basic interface and the damaging_and_goodfaith form.

We'll also need a list of "trusted user groups". These user groups are only given to users who are highly trusted. We'll filter their edits out of the labeling system so we don't waste people's time reviewing good work. For English Wikipedia, we use

  • sysop
  • oversight
  • bot
  • rollbacker
  • checkuser
  • abusefilter
  • bureaucrat
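
As a rough illustration of the filtering described above, here is a minimal Python sketch. The record shape (a dict with a `user_groups` list) and the function names are assumptions made for illustration; this is not the actual prelabeling script, which may apply further criteria.

```python
# Hypothetical sketch: the edit record shape and function names are
# assumptions for illustration, not the real prelabeling script.
TRUSTED_GROUPS = frozenset({
    "sysop", "oversight", "bot", "rollbacker",
    "checkuser", "abusefilter", "bureaucrat",
})


def is_trusted(user_groups, trusted_groups=TRUSTED_GROUPS):
    """Return True if the editor belongs to at least one trusted group."""
    return not frozenset(trusted_groups).isdisjoint(user_groups)


def edits_needing_review(edits, trusted_groups=TRUSTED_GROUPS):
    """Keep only the edits whose author holds no trusted group."""
    return [
        edit for edit in edits
        if not is_trusted(edit.get("user_groups", []), trusted_groups)
    ]
```

A wiki with a shorter list, such as one that only trusts `sysop`, `bureaucrat`, and `bot`, would pass its own set as `trusted_groups`.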

One more thing: can you provide an Albanian translation of "Edit quality (20k sample)"? We'll use this as the title of the edit quality labeling campaign.

Halfak triaged this task as Medium priority. Apr 13 2017, 3:07 PM
Halfak moved this task from Unsorted to Blocked on community input on the Machine-Learning-Team board.

Translations mentioned are done.

Trusted user groups for sq.wikipedia:

  • sysop
  • bureaucrat
  • bot

"Edit quality (20k sample)" - Cilësia e redaktimit (mostra 20k)

Thanks!

Or, if possible, use the same groups as en.wikipedia so we are future-proof.

ok! Thanks for getting back to me so quickly. I'll have this online and ready soon :)

(3.4)halfak@ores-compute-01:~/projects/editquality$ cat datasets/sqwiki.autolabeled_revisions.20k_2016.json | json2tsv reverted_for_damage | sort | uniq -c 
  19637 False
    361 True
(3.4)halfak@ores-compute-01:~/projects/editquality$ cat datasets/sqwiki.autolabeled_revisions.20k_2016.json | json2tsv autolabel.needs_review | sort | uniq -c 
  15435 False
   4563 True

Looks like we'll be labeling 4563 edits. That's a pretty good number. 361 look like they were reverted because they are damaging (best guess). I'll be uploading this to Wikilabels shortly.
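
For anyone without `json2tsv` at hand, the tallies above can be approximated with a short Python sketch. It assumes the dataset holds one JSON object per line and that the fields are laid out as the dotted paths in the commands suggest (`reverted_for_damage` and `autolabel.needs_review`); the function names are made up for illustration.

```python
import json
from collections import Counter


def get_path(obj, dotted_path):
    """Follow a dotted path such as 'autolabel.needs_review' into nested dicts."""
    for key in dotted_path.split("."):
        obj = obj[key]
    return obj


def count_values(lines, dotted_path):
    """Tally the values found at dotted_path across JSON lines (blank lines skipped)."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[get_path(json.loads(line), dotted_path)] += 1
    return counts
```

Passing an open file handle for the dataset together with `"autolabel.needs_review"` would give a `Counter` keyed by `True` and `False`, which plays the role of the `sort | uniq -c` step.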

Halfak updated the task description.
halfak@wikilabels-01:~/datasets$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign sqwiki "Cilësia e redaktimit (mostra 20k)" damaging_and_goodfaith DiffToPrevious 1 50 --config /srv/wikilabels/config/config/
{'form': 'damaging_and_goodfaith', 'tasks_per_assignment': 50, 'wiki': 'sqwiki', 'view': 'DiffToPrevious', 'active': True, 'labels_per_task': 1, 'created': datetime.datetime(2017, 4, 14, 16, 14, 37, 726499), 'id': 57, 'name': 'Cilësia e redaktimit (mostra 20k)'}
halfak@wikilabels-01:~/datasets$ cat sqwiki.autolabeled_revisions.20k_2016.json | grep '"needs_review": true' | wc # | sudo -u www-data /srv/wikilabels/venv/bin/wikilabels task_inserts --config /srv/wikilabels/config/config/ 57
   4563   41627  511497
halfak@wikilabels-01:~/datasets$ cat sqwiki.autolabeled_revisions.20k_2016.json | grep '"needs_review": true' | sudo -u www-data /srv/wikilabels/venv/bin/wikilabels task_inserts --config /srv/wikilabels/config/config/ 57
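
The `grep` filter above depends on the exact spacing of the serialized JSON. A sketch of the same selection done on parsed JSON follows, assuming one object per line with a nested `autolabel.needs_review` field; the function name is hypothetical and this is not part of the wikilabels tooling.

```python
import json


def select_needing_review(lines):
    """Yield the original lines whose autolabel.needs_review is true."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        doc = json.loads(line)
        if doc.get("autolabel", {}).get("needs_review") is True:
            yield line
```

The yielded lines are unchanged, so they could be written to standard output and piped onward in the same way as the `grep` output.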

You can see the labeling interface at: http://labels.wmflabs.org/ui/sqwiki/

Stats are here: http://labels.wmflabs.org/campaigns/sqwiki/?campaigns=stats

All looks good. The translations aren't up yet, but I'll be deploying them next.