Experiment with using English Wikipedia models on Simple English
Closed, ResolvedPublic

Description

Once the Simple English ORES models are enabled on the beta cluster, please copy a few sample edits over from simplewiki, both vandalism and good edits. Smoke-test the scores to see if we need to adjust thresholds, and check whether our features are appropriate.

Adotchar created this task. Dec 1 2017, 8:19 PM
Restricted Application added a project: artificial-intelligence. Dec 1 2017, 8:19 PM
Restricted Application added a subscriber: Aklapper.
Adotchar removed Adotchar as the assignee of this task. Dec 1 2017, 8:20 PM
Krinkle added a subscriber: Krinkle. Dec 1 2017, 8:20 PM
Halfak renamed this task from Add language support for Simple English to Experiment with using English Wikipedia models on Simple English. Dec 2 2017, 4:46 PM
Halfak updated the task description.
Halfak added a subscriber: Halfak. Dec 2 2017, 4:48 PM

I'm repurposing this task to set up the English Wikipedia models on Simplewiki because I think it is worth a try.

WMFLabs: https://github.com/wiki-ai/ores-wmflabs-deploy/pull/93 (merged)
Prod: https://gerrit.wikimedia.org/r/394759

Change 394759 had a related patch set uploaded (by Halfak; owner: halfak):
[mediawiki/services/ores/deploy@master] Use enwiki models on simplewiki.

https://gerrit.wikimedia.org/r/394759

Halfak added a comment. Dec 2 2017, 5:15 PM

OK so here's what I suggest you do.

  1. Disable the new recent changes filters in your preferences.
  2. Edit your "/common.js" to look like mine: https://simple.wikipedia.org/wiki/User:EpochFail/common.js
  3. Go back to Special:RecentChanges and wait a little bit. ORES should highlight changes that are likely to be damaging.
  4. Tell us how it goes!

I did a little bit of testing and I was able to catch some damaging edits :)

Halfak added a comment. Dec 2 2017, 5:15 PM

BTW, if this works out OK, we'll get ORES enabled for the fancy new recent changes filters too.

I tested it out a bit. It works fine: most of the edits it marked were vandalism, although quite a few were not. For example, it marked https://simple.wikipedia.org/w/index.php?title=JAY-Z&curid=75975&diff=5906288&oldid=5906287 as vandalism, when it is just a change of infobox type, and it marked this edit in red: https://simple.wikipedia.org/w/index.php?title=Drake_(entertainer)&curid=210822&diff=5906280&oldid=5905690. So some work could be done, but it did not miss any edits that were vandalism.

Halfak added a comment. Dec 4 2017, 5:14 PM

An important distinction. ORES does not "mark something as Vandalism". Instead, it marks something as "needing review". It's still good to note when it turns out that the review was that the edit was fine. But it's important that you consider the coloring as "there's something that looks funny about this" rather than "there's something wrong with this".

I'm glad to read that it is useful. We'll start moving forward with a deployment.

Halfak added a subscriber: awight. Dec 4 2017, 5:15 PM

@awight, could you look at https://gerrit.wikimedia.org/r/394759 ?

It would be cool if this could go out in a deployment soon.

Change 394759 merged by Awight:
[mediawiki/services/ores/deploy@master] Use enwiki models on simplewiki.

https://gerrit.wikimedia.org/r/394759

Mentioned in SAL (#wikimedia-cloud) [2017-12-04T17:44:32Z] <awight> ORES: Try enwiki models on simplewiki, T181848 (6baed71)

Change 395052 had a related patch set uploaded (by Awight; owner: Awight):
[operations/mediawiki-config@master] Try simplewiki ORES on beta.

https://gerrit.wikimedia.org/r/395052

awight added a comment. Dec 4 2017, 6:11 PM

This change is deployed to the beta service, e.g. https://ores-beta.wmflabs.org/v3/scores/simplewiki/12345

The next steps are to enable the ORES UI and precaching on the beta cluster simplewiki, then if that looks good continue with the production service and config.

Change 395059 had a related patch set uploaded (by Awight; owner: Awight):
[operations/mediawiki-config@master] Enable ORES on simplewiki

https://gerrit.wikimedia.org/r/395059

Change 395052 merged by jenkins-bot:
[operations/mediawiki-config@master] Try simplewiki ORES on beta.

https://gerrit.wikimedia.org/r/395052

Change 395066 had a related patch set uploaded (by Awight; owner: Awight):
[operations/mediawiki-config@master] Add ORES filter thresholds for simplewiki

https://gerrit.wikimedia.org/r/395066

Change 395066 merged by jenkins-bot:
[operations/mediawiki-config@master] Add ORES filter thresholds for simplewiki

https://gerrit.wikimedia.org/r/395066

awight added a comment. Dec 4 2017, 7:21 PM

This is on the beta wiki, but I'm not going to proceed further today because something's missing:
https://simple.wikipedia.beta.wmflabs.org/wiki/Special:RecentChanges

I think that scores aren't being cached into the MediaWiki database yet? OH, we probably have to run a database migration?

awight added a comment. Dec 5 2017, 2:19 PM

Ran into an issue:

This test change,
https://simple.wikipedia.beta.wmflabs.org/w/index.php?diff=3266888

Cannot be found by the extractor,
http://ores-beta.wmflabs.org/v3/scores/simplewiki/?models=damaging%7Cgoodfaith&revids=3266888&precache=true&format=json

{"simplewiki": {"models": {"damaging": {"version": "0.4.0"}, "goodfaith": {"version": "0.4.0"}}, "scores": {"3266888": {"damaging": {"error": {"message": "RevisionNotFound: Could not find revision ({revision}:3266888)", "type": "RevisionNotFound"}}, "goodfaith": {"error": {"message": "RevisionNotFound: Could not find revision ({revision}:3266888)", "type": "RevisionNotFound"}}}}}}
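The request and response above can be reproduced offline with a short sketch. It rebuilds the query URL from its parts and separates scored revisions from per-model errors. The helper names `build_scores_url` and `split_scores` are hypothetical, and the assumption that a successful entry carries a "score" key is not shown in this thread.

```python
import json
from urllib.parse import urlencode

# Response body copied verbatim from the comment above.
RESPONSE = json.loads(
    '{"simplewiki": {"models": {"damaging": {"version": "0.4.0"}, '
    '"goodfaith": {"version": "0.4.0"}}, "scores": {"3266888": {'
    '"damaging": {"error": {"message": "RevisionNotFound: Could not find '
    'revision ({revision}:3266888)", "type": "RevisionNotFound"}}, '
    '"goodfaith": {"error": {"message": "RevisionNotFound: Could not find '
    'revision ({revision}:3266888)", "type": "RevisionNotFound"}}}}}}'
)


def build_scores_url(base, wiki, rev_ids, models, precache=False):
    """Build a v3 scores URL shaped like the one quoted above (hypothetical helper)."""
    params = {
        "models": "|".join(models),
        "revids": "|".join(str(r) for r in rev_ids),
    }
    if precache:
        params["precache"] = "true"
    params["format"] = "json"
    # urlencode percent-encodes the "|" separator as %7C.
    return f"{base}/v3/scores/{wiki}/?{urlencode(params)}"


def split_scores(response, wiki):
    """Return (scored, errors), each keyed by (rev_id, model)."""
    scored, errors = {}, {}
    for rev_id, by_model in response[wiki]["scores"].items():
        for model, result in by_model.items():
            if "error" in result:
                errors[(rev_id, model)] = result["error"]["type"]
            else:
                # Assumption: successful results are stored under "score".
                scored[(rev_id, model)] = result.get("score")
    return scored, errors


url = build_scores_url(
    "http://ores-beta.wmflabs.org", "simplewiki",
    [3266888], ["damaging", "goodfaith"], precache=True,
)
scored, errors = split_scores(RESPONSE, "simplewiki")
print(url)
print(errors)
```

Checking the error type per model makes it easy to tell a missing revision apart from a model failure before digging into the extractor.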

Halfak added a comment. Dec 5 2017, 3:09 PM

https://simple.wikipedia.org/w/index.php?diff=3266888 doesn't exist. It's trying to score the revision *on* Simple English wiki.

awight added a comment. Dec 5 2017, 3:24 PM

Aha, thanks!

On to the next puzzle. All four thresholds were appearing yesterday, but today only one appears on Special:RecentChanges,
https://simple.wikipedia.beta.wmflabs.org/wiki/Special:RecentChanges

The API response is correct for http://ores-beta.wmflabs.org/v3/scores/simplewiki/?models=damaging&model_info=statistics.thresholds.false.%22maximum+recall+%40+precision+%3E%3D+0.995%22%7Cstatistics.thresholds.true.%22maximum+filter_rate+%40+recall+%3E%3D+0.9%22%7Cstatistics.thresholds.true.%22maximum+recall+%40+precision+%3E%3D+0.6%22%7Cstatistics.thresholds.true.%22maximum+recall+%40+precision+%3E%3D+0.9%22&format=json

{"simplewiki": {"models": {"damaging": {"statistics": {"thresholds": {"false": [{"!f1": 0.236, "!precision": 0.136, "!recall": 0.887, "accuracy": 0.804, "f1": 0.888, "filter_rate": 0.222, "fpr": 0.113, "match_rate": 0.778, "precision": 0.995, "recall": 0.801, "threshold": 0.899}], "true": [{"!f1": 0.881, "!precision": 0.996, "!recall": 0.79, "accuracy": 0.794, "f1": 0.23, "filter_rate": 0.767, "fpr": 0.21, "match_rate": 0.233, "precision": 0.132, "recall": 0.901, "threshold": 0.091}, {"!f1": 0.984, "!precision": 0.973, "!recall": 0.995, "accuracy": 0.969, "f1": 0.329, "filter_rate": 0.988, "fpr": 0.005, "match_rate": 0.012, "precision": 0.62, "recall": 0.224, "threshold": 0.769}, {"!f1": 0.983, "!precision": 0.967, "!recall": 1.0, "accuracy": 0.967, "f1": 0.061, "filter_rate": 0.999, "fpr": 0.0, "match_rate": 0.001, "precision": 0.913, "recall": 0.032, "threshold": 0.941}]}}}}}}
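To make the four thresholds easier to read, here is a minimal sketch that works on the values from the response above, trimmed to the fields it uses. The helper names are hypothetical. It assumes that a revision matches a "true"-label level when its damaging probability is at or above that level's threshold, which this thread does not confirm.

```python
# Values copied from the response above, trimmed to the fields used here.
# Rows appear in the order the thresholds were requested in the query.
THRESHOLDS = {
    "false": [
        # maximum recall @ precision >= 0.995
        {"threshold": 0.899, "precision": 0.995, "recall": 0.801, "filter_rate": 0.222},
    ],
    "true": [
        # maximum filter_rate @ recall >= 0.9
        {"threshold": 0.091, "precision": 0.132, "recall": 0.901, "filter_rate": 0.767},
        # maximum recall @ precision >= 0.6
        {"threshold": 0.769, "precision": 0.62, "recall": 0.224, "filter_rate": 0.988},
        # maximum recall @ precision >= 0.9
        {"threshold": 0.941, "precision": 0.913, "recall": 0.032, "filter_rate": 0.999},
    ],
}


def matched_thresholds(p_damaging, thresholds=THRESHOLDS):
    """Return the "true"-label thresholds reached by a damaging probability.

    Assumption: a level matches when p_damaging >= threshold.
    """
    return [
        row["threshold"]
        for row in thresholds["true"]
        if p_damaging >= row["threshold"]
    ]


def match_rate(row):
    """Share of edits a level selects; in the response it equals 1 - filter_rate."""
    return round(1.0 - row["filter_rate"], 3)


for row in THRESHOLDS["true"]:
    print(
        f"threshold={row['threshold']:.3f} "
        f"precision={row['precision']:.3f} "
        f"recall={row['recall']:.3f} "
        f"match_rate={match_rate(row):.3f}"
    )
```

Read this way, the lowest level catches about 90% of damaging edits while flagging roughly 23% of all edits, and the highest level flags about 0.1% of edits at roughly 91% precision.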

Cache contents show the correct values!

Catrope added a subscriber: Catrope. Dec 6 2017, 7:43 PM

Right now I'm not seeing the ORES filters in RC at all on simplewiki in labs, despite https://gerrit.wikimedia.org/r/395066.

awight claimed this task. Dec 20 2017, 8:04 PM

Now throwing a stack trace that no goodfaith model exists in the database...

I ran CheckModelVersions manually, which brought the database models back. I think that's not on a cronjob, so I'll add it to the "new model checklist".

Change 399464 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/ORES@master] Don't double-quote model version

https://gerrit.wikimedia.org/r/399464

awight removed awight as the assignee of this task. Dec 20 2017, 9:02 PM

@Catrope This is unstalled and ready for testing on the beta cluster. Would you like to own the rest of the config + deployment, since you have a patch ready?

@Catrope I should have read the title of the task... this is ours for a while longer. We need to poke at the data and see if a model built for enwiki is valid on simplewiki, since it's the first time we've tried such boldness.

Oh this is done. It has been reviewed by @Adotchar

Change 399464 merged by jenkins-bot:
[mediawiki/extensions/ORES@master] Don't double-quote model version

https://gerrit.wikimedia.org/r/399464

@Halfak @Adotchar can either of you confirm that this task can be closed? I remember some confusion on IRC when Aaron's comment went through. I would assume that experimentation would have to be done on the beta cluster, which doesn't have the models enabled yet, so unsure how this task would be completed. I'll update the task description per my understanding, please edit if I'm off-base.

awight updated the task description. Jan 18 2018, 7:48 PM

I’ve been testing this for a few weeks. English models work perfectly.

awight closed this task as Resolved. Jan 18 2018, 7:50 PM
awight claimed this task.

Thanks for the confirmation!

Change 395059 abandoned by Awight:
Enable ORES on simplewiki

https://gerrit.wikimedia.org/r/395059