
Update Turkish Wikipedia's labeling campaign for 2020
Closed, Resolved · Public

Description

The old 2016 campaign hasn't seen much activity. Let's update the campaign with a sample from 2020 and have people label it.

Make sure to ping back on the original discussion once this is done. See https://meta.wikimedia.org/wiki/User_talk:Halfak_(WMF)#ORES_in_Turkish

Event Timeline

Restricted Application added a subscriber: Aklapper.
Halfak triaged this task as Medium priority. Jul 13 2020, 4:36 PM
Halfak moved this task from Unsorted to New development on the Machine-Learning-Team board.
Halfak added a subscriber: calbon.

@calbon this is the Wikilabels task we talked about at backlog grooming.

@Evrifaessa, I've moved to a new job, so I'm not managing the backlog for ORES anymore. For now, @calbon is responsible for prioritizing tasks like this one.

I'm happy to help out when @calbon is ready to take this on though.

@Evrifaessa Heyo! I have taken over the team. Right now, with Aaron leaving and other changes, we are down to Kevin and myself. But when some new folks start, we can tackle this and the other items in the backlog.

Hi @calbon

We are actively trying to develop a counter-vandalism bot at the moment, so it would be of tremendous help to us if this task could be given some priority in the backlog.

Here's a query that gathers a random sample of 20k revisions from the last year: https://quarry.wmflabs.org/query/47980

The next step is to pull the results of this query into the https://github.com/wikimedia/editquality Makefile and run the autolabel utility. If the number of revisions with "needs_review": true is between 2k and 5k, we're good to go. Load those up into Wikilabels and we can go from there. If the needs_review count is too low or too high, we'll need to adjust the size of the incoming sample.
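
Once autolabeling has run, a quick way to get that count is something like the following (the dataset filename is illustrative; use whatever the Makefile actually writes out):

$ grep -c '"needs_review": true' datasets/trwiki.autolabeled_revisions.20k_2020.json
# aim for a result between 2000 and 5000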

@Vito-Genovese Sounds good, I am going to put it to the front of the backlog and try to get to it this week or early next week.

@Halfak Thank you so much for your help as always.

@Halfak I followed these instructions and took the following steps:

  1. Got qrun_id: 495204 from the HTML source code of the query that you shared.
  2. Created this link that enables editquality to download the resultset programmatically.
  3. Replaced old resultset with new resultset in the Makefile.
  4. Renamed config trwiki datasets from *20k_2015 to *20k_2019 in the Makefile.

Based on the instructions, the next step would have been to run the autolabel utility. I couldn't run it because the Makefile invokes the autolabel utility for other wikis, but there is no such target for trwiki.
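
For reference, this is the kind of check that shows the gap (the grep pattern is just illustrative):

$ grep -n 'autolabel' Makefile | grep trwiki    # returns nothing
$ grep -n 'autolabel' Makefile | grep fawiki    # other wikis do have autolabel steps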

Please let me know how I can move forward from this point.

Thanks!

Nice work on the progress @kevinbazira!

I think a good next step to help us work together would be to get a PR up for the changes you are making in the Makefile. It'll be easier to reference parts of the Makefile that way.

So, trwiki was one of the first wikis we supported, so it's a bit weird. We'll want to borrow some configuration from the other wikis to see what to do here. In this case, fawiki is a good example to work from because, like trwiki, it has two separate labeling campaigns. Also, fawiki is one of the first wikis we worked on, so its config is weird in the same way! See https://github.com/wikimedia/editquality/blob/master/config/wikis/fawiki.yaml

In that config, you can see that only the 2016 sample (the newer one for fawiki) gets the "autolabeled_samples" treatment.
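
If you want to see where that shows up in a local checkout, a grep over that config is enough (just a convenience command):

$ grep -n -B2 -A4 'autolabel' config/wikis/fawiki.yaml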

Based on this query I can see that the "trusted_groups" should include:

- sysop
- oversight
- bot
- rollbacker
- checkuser
- abusefilter
- bureaucrat
- flow-bot
- interface-admin
- interface-editor

If that doesn't make the next step clear, get what you have in a PR and we'll take it from there :)

@Halfak, what do you mean by trusted_groups? What is it used for? And why didn't you include the patroller group in that list?

@Evrifaessa

"trusted_groups" are user groups whose members' edits we don't want to waste your time reviewing. E.g., we can be reasonably sure that admins aren't vandalizing Wikipedia. Is that true for people who are given the Patroller right? Either way, we'll still ask you to review any edits by editors in these "trusted_groups" that were reverted, just in case there was some unintentional damage involved.

@kevinbazira

PR is in a good state. I just left the second round of notes. We should add "patroller" to the list of trusted groups if @Evrifaessa confirms that there's enough of a barrier to becoming a patroller that we can trust people with these rights.

Next we'll want to load the "needs_review": true data into Wikilabels. How many revisions in the sample contain that data value? You should be able to find out with some CLI fu like this:

cat datasets/trwiki.autolabeled_revisions.20k_2020.json | grep '"needs_review": true' | wc -l

We'll want to pull the data into Wikilabels.

@Halfak, we do not have a rollbacker group, so it doesn't make sense to add rollbackers to trusted_groups since we don't have any. That said, I don't see anything wrong with including patrollers instead of rollbackers, because the patroller group more or less covers what the rollbacker group does. Either way, in my personal opinion, we can include patrollers in trusted_groups. I'd also like to hear @Vito-Genovese's opinion on this.

Indeed. The Patroller user group at trwiki is a combination of 1) the original Patroller user group (introduced by Extension:Patroller), 2) the Editor user group (introduced by Extension:FlaggedRevs), and 3) the rollback right. So, they are certainly trustworthy.

@calbon, I created this PR that added 2020 trwiki data configurations to editquality.

The next step is to load the data into Wikilabels. Aaron, who has been advising on this, has been traveling lately, as he mentioned at the end of the PR above.

We hope to proceed as soon as he is available.

Sorry for the delay. Just drove across a continent and I'm moving into a new house! I should be able to get back to supporting this task next week.

@Halfak , have you been able to get to this one?

Thanks for the ping. I do still have this on my todo list and I should be able to give Kevin the stuff he needs to get it done this week.

As promised, I've loaded the new labeling campaign, and I've included a summary of the actions I performed below to document this process.

First, let's connect to the wikilabels production VM in labs:

$ ssh wikilabels-02.eqiad.wmflabs

Make a backup of the database just in case you make a mistake.

halfak@wikilabels-02:~$ cd backups/
halfak@wikilabels-02:~/backups$ pg_dump -d u_wikilabels -U u_wikilabels -h wikilabels.db.svc.eqiad.wmflabs -W | gzip -c > ../backups/2020-10-21.sql.gz
Password:

You'll need the password for the database in order to perform this action. I get it from the local config. Note that I have censored the password below. You'll be able to see it when you are on the machine.

halfak@wikilabels-02:~/backups$ cat /srv/wikilabels/config/config/98-database.yaml 
# These credentials are intended to be used on labels.wmflabs.org.  They are
# sensitive and should never be commited to a public repository.
database:
  user: u_wikilabels
  dbname: u_wikilabels
  password: <password>
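
For completeness, a restore from that dump would be roughly the reverse with standard postgres tooling (in practice you'd restore into a freshly created database rather than over the live one); it wasn't needed here:

$ gunzip -c ../backups/2020-10-21.sql.gz | psql -d u_wikilabels -U u_wikilabels -h wikilabels.db.svc.eqiad.wmflabs -W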

Now, let's create a new campaign. I run the script from the production code, using the production config. The new_campaign script does the heavy lifting; its positional arguments are the wiki, the campaign name, the form, the view, the number of labels per task (1), and the number of tasks per assignment (50).

halfak@wikilabels-02:~/backups$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign trwiki "Değişiklik kalitesi (3,000 rastgele örnekleme, 2020)" damaging_and_goodfaith DiffToPrevious 1 50  --config=/srv/wikilabels/config/config/
{'active': True, 'created': datetime.datetime(2020, 10, 21, 14, 38, 51, 290336), 'tasks_per_assignment': 50, 'view': 'DiffToPrevious', 'name': 'Değişiklik kalitesi (3,000 rastgele örnekleme, 2020)', 'info_url': None, 'id': 96, 'form': 'damaging_and_goodfaith', 'labels_per_task': 1, 'wiki': 'trwiki'}

You can see that the output contains structured information about the campaign that was just created. The most important bit of information we need is the campaign ID. Here, you see 'id': 96. In this next command, we use the task_inserts script to insert all of the observations that contain "needs_review": true.

halfak@wikilabels-02:~/backups$ cat ../datasets/trwiki.autolabeled_revisions.20k_2020.json | grep '"needs_review": true' | sudo -u www-data /srv/wikilabels/venv/bin/wikilabels task_inserts 96 --config=/srv/wikilabels/config/config/

Now we're done! You can check our work at https://labels.wmflabs.org/ui/trwiki/
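
As an extra sanity check, the new campaign should also show up in the campaign listing. Something like this should work, assuming the JSON listing endpoint mirrors the UI path (that endpoint path is an assumption on my part):

$ curl -s https://labels.wmflabs.org/campaigns/trwiki/ | python3 -m json.tool | grep '"id": 96'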

Hi @Halfak
Thanks for everything. ORES would be a huge time saver for us.

I started labeling, but it shows edits by sysops. I thought we excluded sysops and patrollers.

Whoops! That's definitely not right. I'm guessing there was a step in the script that didn't work as intended. I might need to re-work the sample, so hold off on labeling until we have found the issue, OK?

Aha! I figured out what is going on. It turns out that specific edit was reverted. We flag sysop edits for review when they are reverted, just in case something weird was going on (e.g. a damaging mistake that was made in good faith). So all may be fine. Let me know if you see other examples and I'll dig into them.

The labeling is complete (well, except for 1 label that can't be reached, probably for a technical reason :)).

What should we do next?

Fantastic! I can work with the data you have provided to update the model. I'll try to get that work in soon but as I'm just a volunteer with a new baby, I can't give you any guarantees on when I'll be able to get to it. But a week or two seems likely at this point.

Thanks, Aaron. Congrats on the baby btw!

I've finally got the deployment of ORES unblocked. That was a surprisingly large amount of work to get things cleaned up. We're now blocked on getting this to production before we can get retrained Turkish models out. See T278723: ORES deployment - Spring 2021.

I've dropped an email to @calbon to ask him to raise the review priority. I'll report back on when this blocker is cleared.

This change will look a lot like this work for ptwiki: https://github.com/wikimedia/editquality/pull/225/files

Regrettably, the diff for that pull request is messy because it didn't contain the model in LFS, so I had to force-push some changes. The most relevant change can be seen in the ptwiki.yaml configuration.

In order to complete this work, the trwiki.yaml file will need to be updated to include the new campaign. Then the Makefile will need to be regenerated with ./utility generate_make > Makefile.

Then you'd run make trwiki_models to start the model building process. It's best to run this command on ores-misc01.eqiad.wmflabs, so we'll need to get folks access.
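
Putting that together, the run would look roughly like this (the nohup wrapper is just a suggestion so the long-running build survives an SSH disconnect):

$ cd editquality
$ ./utility generate_make > Makefile               # after adding the new campaign to config/wikis/trwiki.yaml
$ nohup make trwiki_models > trwiki_models.log 2>&1 &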

I don't seem to have access to ores-misc01.eqiad.wmflabs.

This is what I get when I run it:

kbazira@kbazira:~$ ssh ores-misc01.eqiad.wmflabs
channel 0: open failed: administratively prohibited: open failed
stdio forwarding failed
ssh_exchange_identification: Connection closed by remote host

Let me work on getting access to the server, after which I'll proceed to complete this work.

Thanks to @elukey, I was able to get access using both ssh ores-misc-01.ores-staging.eqiad1.wikimedia.cloud and ssh ores-misc-01.eqiad.wmflabs.

I have updated the trwiki.yaml and generated a new makefile. PR: https://github.com/wikimedia/editquality/pull/233

Whenever I run the model-building process, it fails as seen below:

(venv) kevinbazira@ores-misc-01:~/editquality$ make trwiki_models
cat datasets/trwiki.labeled_revisions.23k_2015_2020.json | \
revscoring extract \
	editquality.feature_lists.trwiki.damaging \
	editquality.feature_lists.trwiki.goodfaith \
	--host https://tr.wikipedia.org \
	--extractors 4 \
	--verbose > datasets/trwiki.labeled_revisions.w_cache.23k_2015_2020.json
 35%|██████████████████████████████████▋                                                                 | 7974/22977 [04:43<11:07, 22.49it/s]Traceback (most recent call last):
  File "/home/kevinbazira/editquality/venv/bin/revscoring", line 8, in <module>
    sys.exit(main())
  File "/home/kevinbazira/editquality/venv/lib/python3.5/site-packages/revscoring/revscoring.py", line 53, in main
    module.main(sys.argv[2:])
  File "/home/kevinbazira/editquality/venv/lib/python3.5/site-packages/revscoring/utilities/extract.py", line 111, in main
    profile_f, verbose, debug)
  File "/home/kevinbazira/editquality/venv/lib/python3.5/site-packages/revscoring/utilities/extract.py", line 143, in run
    dump_observation(observation, output)
  File "/home/kevinbazira/editquality/venv/lib/python3.5/site-packages/revscoring/utilities/util.py", line 37, in dump_observation
    json.dump(observation, f)
  File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
    fp.write(chunk)
OSError: [Errno 28] No space left on device
Makefile:3723: recipe for target 'datasets/trwiki.labeled_revisions.w_cache.23k_2015_2020.json' failed
make: *** [datasets/trwiki.labeled_revisions.w_cache.23k_2015_2020.json] Error 1
make: *** Deleting file 'datasets/trwiki.labeled_revisions.w_cache.23k_2015_2020.json'

I've checked both disk space and inodes, and there seems to be space available:

(venv) kevinbazira@ores-misc-01:~/editquality$ df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                           18G     0   18G   0% /dev
tmpfs                         3.6G   20M  3.6G   1% /run
/dev/vda3                      19G   18G   49M 100% /
tmpfs                          18G     0   18G   0% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                          18G     0   18G   0% /sys/fs/cgroup
/dev/mapper/vd-srv             60G   38G   19G  68% /srv
labstore1006.wikimedia.org:/   98T   72T   22T  77% /mnt/nfs/dumps-labstore1006.wikimedia.org
labstore1007.wikimedia.org:/   97T   77T   16T  84% /mnt/nfs/dumps-labstore1007.wikimedia.org
tmpfs                         3.6G     0  3.6G   0% /run/user/21773

...

(venv) kevinbazira@ores-misc-01:~/editquality$ df -i
Filesystem                       Inodes   IUsed      IFree IUse% Mounted on
udev                            4630350     344    4630006    1% /dev
tmpfs                           4633787    1146    4632641    1% /run
/dev/vda3                       1245184  319512     925672   26% /
tmpfs                           4633787       1    4633786    1% /dev/shm
tmpfs                           4633787       3    4633784    1% /run/lock
tmpfs                           4633787      16    4633771    1% /sys/fs/cgroup
/dev/mapper/vd-srv              3964928   39895    3925033    2% /srv
labstore1006.wikimedia.org:/ 1648263168 2384681 1645878487    1% /mnt/nfs/dumps-labstore1006.wikimedia.org
labstore1007.wikimedia.org:/ 1631485952 2384534 1629101418    1% /mnt/nfs/dumps-labstore1007.wikimedia.org
tmpfs                           4633787      11    4633776    1% /run/user/21773

@Halfak, am I missing something?

@kevinbazira on ores-misc-01 the root partition is full :(

(venv) kevinbazira@ores-misc-01:~/editquality$ df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                           18G     0   18G   0% /dev
tmpfs                         3.6G   20M  3.6G   1% /run
/dev/vda3                      19G   18G   49M 100% /  <====

The home dirs are heavily used, so there may not be much space left to run that command from your home dir:

elukey@ores-misc-01:/home$ sudo du -hs * | sort -h
16K	elukey
20K	andrew
24K	codezee
484M	dibyaaaaax
537M	shgcdr07
621M	chrisalbon
692M	halfak
866M	awight
1.3G	kevinbazira
1.3G	ladsgroup
1.5G	chtnnh
1.5G	he7d3r
2.6G	haks
3.3G	pavol86

The /srv partition may be an alternative; what if we try the following:

  • sudo mkdir /srv/kevinbazira
  • sudo chown kevinbazira:kevinbazira /srv/kevinbazira
  • cp -r /home/kevinbazira/editquality /srv/kevinbazira
  • cd /srv/kevinbazira/editquality
  • make trwiki_models ...

Before that, it would be good to clean up your home dir so we get some free space in the root partition (and the other users should do the same :D).

Let me know if it makes sense!

Thank you so much @elukey, I moved my work to /srv/home/kevinbazira and cleared /home/kevinbazira/. This did the trick.

@Halfak, I have updated the PR with the new models.

I left some comments on the PR. There's just some practical data-work to do. I hope my instructions are clear enough. There's also a weird travis failure that I hope to look into. It will require running python 3.5 locally. I believe python 3.5 is the default python3 on ores-misc-01 so that should make things easy.

Thank you for the review @Halfak, I have updated the PR with the suggestions you made.

Looks good. I made a parallel PR to fix a weird travis issue that was preventing your tests from passing. If you merge https://github.com/wikimedia/editquality/pull/234 and rebase, I think the tests will pass then.
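
Assuming the usual flow against the upstream master branch (the remote name may differ in your checkout), the rebase would be something like:

$ git fetch origin
$ git rebase origin/master
$ git push --force-with-lease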

Change 691509 had a related patch set uploaded (by Elukey; author: Elukey):

[mediawiki/services/ores/deploy@master] Bump editquality's submodule sha to its latest version

https://gerrit.wikimedia.org/r/691509

Change 691509 merged by Elukey:

[mediawiki/services/ores/deploy@master] Bump editquality's submodule sha to its latest version

https://gerrit.wikimedia.org/r/691509

Deployed to beta, https://ores-beta.wmflabs.org/v3/scores/trwiki/123 seems to work. Nothing weird reported in celery/uwsgi on the beta host.
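
For anyone who wants to spot-check the beta response from the command line (just a convenience; it's the same URL as above):

$ curl -s https://ores-beta.wmflabs.org/v3/scores/trwiki/123 | python3 -m json.tool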

@Halfak on paper we'd be ready to deploy to production, but if you could double-check the new models first, that would be great :)

@Evrifaessa @Vito-Genovese @Mavrikant hi! The new models should be ready to deploy; I just wanted to give you a heads-up before proceeding :)

Mentioned in SAL (#wikimedia-operations) [2021-05-17T15:06:15Z] <elukey@deploy1002> Started deploy [ores/deploy@3e1ff5f]: Update editquality submodule after Turkish Wikipedia's labelling campain - T257359

Mentioned in SAL (#wikimedia-operations) [2021-05-17T15:26:03Z] <elukey@deploy1002> Finished deploy [ores/deploy@3e1ff5f]: Update editquality submodule after Turkish Wikipedia's labelling campain - T257359 (duration: 19m 48s)

I just had a chance to take a look at this. Sorry I was AFK this weekend. But all looks good to me.

Folks following this ticket should expect to see a small but noticeable improvement in the performance of the models. The primary effect should be that fewer non-damaging edits will be flagged for review, saving patrollers some time.

Thanks for the deployment. Does this instantly affect the scoring interface?

Thanks all for getting this deployment out!

Pppery removed a project: Patch-For-Review.
Pppery added a subscriber: Pppery.

Assuming the above comments mean this is resolved.