
[research] Why is the japanese 'reverted' model so bad?
Open, Low, Public

Description

See https://github.com/wiki-ai/editquality/pull/24

A logistic regression fits better than RF or GB. Why? What's going on here?
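For anyone poking at this outside the editquality pipeline, here's a minimal comparison sketch. The feature matrix and labels are hypothetical stand-ins (synthetic data with roughly the class imbalance we see here), not the actual revscoring features:

# Compare the three model families on the same (here: synthetic) feature matrix.
# X and y are hypothetical stand-ins for revscoring feature values and reverted labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data with heavy class imbalance (~2% positives, like the reverted class).
X, y = make_classification(n_samples=20000, n_features=40,
                           weights=[0.98, 0.02], random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "RandomForest": RandomForestClassifier(n_estimators=300, class_weight="balanced"),
    "GradientBoosting": GradientBoostingClassifier(n_estimators=300, learning_rate=0.01),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: ROC-AUC = {scores.mean():.3f} (+/- {scores.std():.3f})")

With very few positive examples, the tree ensembles have plenty of room to memorize noise, which fits the overfitting observation below.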

Event Timeline

Here are my first results.

  • The number of reverted edits is very small (382), which means we are prone to overfitting.

Let me increase the number of edits to 40K and the revert-detection radius to 7 revisions. We might get something.
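For reference, the radius here is how many subsequent revisions we look through for an identity revert of the edit. A rough sketch of how such labeling can be done against the live API (this is not the exact editquality autolabel command; names and parameters are illustrative):

import mwapi
import mwreverts.api

session = mwapi.Session("https://ja.wikipedia.org",
                        user_agent="editquality research sketch")

def was_reverted(rev_id, radius=7):
    # mwreverts.api.check() returns (reverting, reverted, reverted_to);
    # `reverted` is non-None if this revision was identity-reverted within `radius`.
    _, reverted, _ = mwreverts.api.check(session, rev_id, radius=radius)
    return reverted is not None

print(was_reverted(12345678))  # hypothetical revision ID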

With 40K edits:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, max_leaf_nodes=null, center=true, presort="auto", learning_rate=0.01, init=null, verbose=0, min_samples_leaf=1, n_estimators=700, max_features="log2", balanced_sample_weight=true, random_state=null, scale=true, max_depth=7, warm_start=false, loss="deviance", min_samples_split=2, subsample=1.0, min_weight_fraction_leaf=0.0
 - version: 0.0.1
 - trained: 2016-05-21T13:00:24.789069

Table:
                 ~False    ~True
        -----  --------  -------
        False      6710     1055
        True         99       97

Accuracy: 0.855
Precision: 0.084
Recall: 0.495
PR-AUC: 0.122
ROC-AUC: 0.804
Recall @ 0.1 false-positive rate: threshold=0.963, recall=0.005, fpr=0.0
Filter rate @ 0.9 recall: threshold=0.129, filter_rate=0.505, recall=0.903
Filter rate @ 0.75 recall: threshold=0.294, filter_rate=0.708, recall=0.75
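Sanity-checking the headline numbers directly from the confusion matrix (rows are the true labels; the ~False/~True columns are the predictions):

# Confusion-matrix cells copied from the table above.
tn, fp = 6710, 1055   # actual False: predicted ~False / ~True
fn, tp = 99, 97       # actual True:  predicted ~False / ~True

accuracy = (tp + tn) / (tp + tn + fp + fn)   # ~0.855
precision = tp / (tp + fp)                   # ~0.084
recall = tp / (tp + fn)                      # ~0.495
fpr = fp / (fp + tn)                         # ~0.136 at this operating point

print("accuracy=%.3f precision=%.3f recall=%.3f fpr=%.3f"
      % (accuracy, precision, recall, fpr))

The recall@fpr and filter-rate figures come from sweeping the probability threshold rather than from this single operating point.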

That looks much better. Let's see if we need to do anything else.

Another thing I learned from Japanese Wikipedia is that the number of unregistered users making good edits is much higher than on other wikis; just check out their RecentChanges. This causes features such as user age to lose their predictive value. And since we can't get much signal from Japanese text (no dictionary words, etc.), our models won't be as good as we want unless we add features specific to Japanese Wikipedia/the Japanese language.

P3161 and P3162 are cases where a native Japanese speaker can help.

@Ladsgroup, can you please put these on a wiki and use [[:ja:Special:Diff/...]] so they are linked nicely?

Ladsgroup subscribed.
Halfak triaged this task as Low priority. Jul 5 2016, 2:34 PM

@Miya, do you have any interest in teaching a computer to identify "bad" edits at the Japanese Wikipedia? Do you know anyone at jawiki who might be interested?

@Elitre I don't fully understand what you need. Does it have something to do with the message by とある白い猫 posted to Japanese Wikipedia about "Research:Revision scoring as a service"?

Or do you need a bad word list like the one below?

A list made by the Niconico live streaming service, a branch of Niconico.

@Miya: Hey, we are working on building anti-vandalism tools for Japanese Wikipedia using AI (for example, see ORES in Beta Features on Wikidata). What we need right now is someone with knowledge of Japanese to tell us how many of the edits linked in T133405#2322879 and marked as bad are actually bad, and how many of the edits marked as good are actually good, so that we know our false positives and can improve on them. Please feel free to ask if anything is unclear.

@Ladsgroup, just a suggestion: please be as specific as possible when making such requests :) The page Miya needs to review is an archived version of a sandbox on Meta. Where is she supposed to comment, and how? Thanks.

I now recommend either using https://ja.wikipedia.org/wiki/利用者:Elitre_(WMF)/ORES or an Etherpad/Google Doc if multiple people need to edit. HTH.

We just got pinged to re-consider this here: https://www.mediawiki.org/wiki/Topic:Ub6ir6tww9z81960

@Ladsgroup, from my skimming of the page and notes, it seems like the model is good enough to deploy. How did you generate this data?