
[research] Why is the Japanese 'reverted' model so bad?
Open, Low · Public

Description

See https://github.com/wiki-ai/editquality/pull/24

A logistic regression fits better than RF or GB. Why? What's going on here?

Event Timeline

Halfak created this task. · Apr 22 2016, 4:47 PM
Restricted Application added a subscriber: Aklapper. · Apr 22 2016, 4:47 PM

Here are my first results.

  • The number of reverted edits is very small (382), which means we are prone to overfitting.

Let me increase the number of edits to 40K and increase the radius to 7 revisions. We might get something.

Ladsgroup added a comment. · Edited · May 21 2016, 1:04 PM

With 40K edits:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, max_leaf_nodes=null, center=true, presort="auto", learning_rate=0.01, init=null, verbose=0, min_samples_leaf=1, n_estimators=700, max_features="log2", balanced_sample_weight=true, random_state=null, scale=true, max_depth=7, warm_start=false, loss="deviance", min_samples_split=2, subsample=1.0, min_weight_fraction_leaf=0.0
 - version: 0.0.1
 - trained: 2016-05-21T13:00:24.789069

Table:
                 ~False    ~True
        -----  --------  -------
        False      6710     1055
        True         99       97

Accuracy: 0.855
Precision: 0.084
Recall: 0.495
PR-AUC: 0.122
ROC-AUC: 0.804
Recall @ 0.1 false-positive rate: threshold=0.963, recall=0.005, fpr=0.0
Filter rate @ 0.9 recall: threshold=0.129, filter_rate=0.505, recall=0.903
Filter rate @ 0.75 recall: threshold=0.294, filter_rate=0.708, recall=0.75

This looks much better. Let's see whether we need to do anything else.
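For anyone checking the numbers, here is a minimal sketch of how the headline statistics follow from the confusion table above. It assumes rows are actual labels and columns are predictions; the function name `confusion_metrics` is made up for illustration and is not part of revscoring or editquality.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Derive summary statistics from a 2x2 confusion table (hypothetical helper)."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,   # share of all edits classified correctly
        "precision": tp / (tp + fp),     # share of flagged edits that were reverted
        "recall": tp / (tp + fn),        # share of reverted edits that were flagged
        "fpr": fp / (fp + tn),           # share of non-reverted edits that were flagged
    }


# Counts copied from the table: False/~False, False/~True, True/~False, True/~True.
metrics = confusion_metrics(tn=6710, fp=1055, fn=99, tp=97)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reverted edits make up only about 2.5% of this test set (196 of 7961), so even with recall near 0.5 the 1055 false positives swamp the 97 true positives. That is why precision stays low while accuracy looks high.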

Ladsgroup added a comment. · Edited · May 21 2016, 1:09 PM

Another thing I learned from Japanese Wikipedia is that the number of unregistered users making good edits is much higher than on other wikis. Just check out their recent changes (RC). This causes features such as user age to lose their predictive value. And since we can't get much signal from Japanese text (no dictionary words, etc.), our models won't be as good as we want unless we add features specifically for Japanese Wikipedia or the Japanese language.

P3161 and P3162 are cases where a native Japanese speaker can help.

@Ladsgroup, can you please put these on a wiki and use [[:ja:Special:Diff/...]] so that they are linked nicely?

He7d3r added a subscriber: He7d3r. · May 24 2016, 2:03 PM

In case it helps, I copied them to m:Meta:Sandbox.

Ladsgroup removed Ladsgroup as the assignee of this task. · May 25 2016, 9:55 PM
Ladsgroup added a subscriber: Ladsgroup.
Halfak triaged this task as Low priority. · Jul 5 2016, 2:34 PM
Whatamidoing-WMF added a subscriber: Whatamidoing-WMF.

@Miya, do you have any interest in teaching a computer to identify "bad" edits at the Japanese Wikipedia? Do you know anyone at jawiki who might be interested?

Elitre added a subscriber: Elitre. · Jul 10 2016, 7:48 AM
Miya added a comment. · Jul 13 2016, 1:33 PM

@Elitre I don't fully understand what you need. Is it something to do with the message by とある白い猫 posted to Japanese Wikipedia about "Research:Revision scoring as a service"?

Or do you need a bad word list like below?

list made by Niconico live streaming service, a branch of Niconico

Ladsgroup added a comment. · Edited · Jul 13 2016, 1:49 PM

@Miya: Hey, we are working on building anti-vandalism tools for Japanese Wikipedia using AI (for example, see ORES in the beta features on Wikidata). What we need right now is someone with knowledge of the Japanese language to tell us how many of the edits linked in T133405#2322879 and marked as bad are actually bad, and how many of the edits marked as good are actually good. That way we learn about our false positives and can reduce them. Please feel free to ask if anything is unclear.

@Ladsgroup, just a suggestion: please be as specific as possible when making such requests :) The page Miya needs to review is an archived version of a sandbox on Meta. Where is she supposed to comment, and how? Thanks.

I have now recommended using either https://ja.wikipedia.org/wiki/利用者:Elitre_(WMF)/ORES or an Etherpad/Google Doc if multiple people need to edit. HTH.

Rxy added a subscriber: Rxy. · Jan 12 2017, 4:06 AM

We just got pinged to reconsider this here: https://www.mediawiki.org/wiki/Topic:Ub6ir6tww9z81960

@Ladsgroup, from my skimming of the page and notes, it seems like the model is good enough to deploy. How did you generate this data?

Restricted Application added a project: artificial-intelligence. · Apr 13 2018, 2:20 PM

If you think so, I'm fine :D