Train/test reverted model for enwiktionary
Closed, Resolved · Public
Assigned To

Authored By

	Halfak
	Jun 24 2016, 9:25 PM

Description

Contact: @jberkel

Event Timeline

Halfak created this task. Jun 24 2016, 9:25 PM

Restricted Application added subscribers: Zppix, Aklapper. · Jun 24 2016, 9:25 PM

Halfak updated the task description. Jun 24 2016, 9:27 PM

Halfak added a subscriber: jberkel.

Ladsgroup claimed this task. Jun 25 2016, 10:05 AM

Ladsgroup edited projects, added Machine-Learning-Team (Active Tasks); removed Machine-Learning-Team.

Ladsgroup moved this task from Parked to Backlog on the Machine-Learning-Team (Active Tasks) board.

OK. I've got a 40K sample and we had only 164 reverted cases. The Wikidata case again ;)

We can either run a 200K sample or go through the dumps. @Halfak, any ideas?

Wait, 40K changes and only 164 reversions? That sounds too low. Maybe lots of bot changes in there?

Yeah, probably. It's okay. I'll go with 200K and the total number should be fine.

(p3)ladsgroup@ores-compute-01:~/editquality$ grep "True" datasets/enwiktionary.rev_reverted.200k_2016.tsv | wc -l
815
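
As a cross-check on the grep count, which matches "True" anywhere on a line, here is a minimal sketch of a column-aware count plus the implied revert rate. It assumes the TSV keeps the label ("True"/"False") in its last tab-separated column; the thread does not show the file layout, so that is a guess, and `count_labels` and `revert_rate` are hypothetical helper names.

```python
from collections import Counter


def count_labels(lines, label_column=-1):
    """Count label values in tab-separated lines.

    Assumption: the reverted label is in the last column of each row.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        counts[line.split("\t")[label_column]] += 1
    return counts


def revert_rate(n_reverted, n_total):
    """Fraction of sampled revisions that were reverted."""
    return n_reverted / n_total


# Usage (path taken from the command above):
#   with open("datasets/enwiktionary.rev_reverted.200k_2016.tsv") as f:
#       print(count_labels(f))
```

With the numbers reported in this thread, 815 out of 200K is about 0.41%, the same rate as 164 out of 40K, so the larger sample confirms the low revert rate rather than changing it.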

That's acceptable, but I think we should sample from the false (non-reverted) cases, since running feature extraction and training on all 200K is not very wise. I think a 20K sample would be enough. Let me do that and see what happens.
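
The sampling idea above can be sketched roughly as follows: keep every reverted row and draw a random subset of the non-reverted ones. This is only an illustration, not the script that was actually run. The `(rev_id, reverted)` row shape, the `downsample_negatives` name, and the fixed seed are all assumptions.

```python
import random


def downsample_negatives(rows, n_negatives, seed=0):
    """Keep all reverted rows and a random subset of non-reverted rows.

    rows: list of (rev_id, reverted) tuples, where reverted is a bool.
    n_negatives: how many non-reverted rows to keep (capped at what exists).
    """
    positives = [row for row in rows if row[1]]
    negatives = [row for row in rows if not row[1]]

    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    kept = rng.sample(negatives, min(n_negatives, len(negatives)))

    sampled = positives + kept
    rng.shuffle(sampled)  # avoid all positives sitting at the front
    return sampled
```

Note that metrics measured on a set sampled this way reflect the sampled class ratio, not the ratio seen on the live wiki.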

2016-06-30 13:09:23,221 INFO:revscoring.utilities.train_test -- Training model...
2016-06-30 13:09:36,251 INFO:revscoring.utilities.train_test -- Testing model...
ScikitLearnClassifier
 - type: RF
 - params: max_depth=null, balanced_sample_weight=true, warm_start=false, class_weight=null, max_leaf_nodes=null, scale=true, max_features="log2", min_weight_fraction_leaf=0.0, min_samples_split=2, bootstrap=true, balanced_sample=false, oob_score=false, min_samples_leaf=3, criterion="entropy", random_state=null, verbose=0, center=true, n_estimators=320, n_jobs=1
 - version: 0.0.1
 - trained: 2016-06-30T13:09:36.246755

Table:
                 ~False    ~True
        -----  --------  -------
        False      3974       30
        True         39      101

Accuracy: 0.983
Precision: 0.771
Recall: 0.721
PR-AUC: 0.757
ROC-AUC: 0.985
Recall @ 0.1 false-positive rate: threshold=0.943, recall=0.129, fpr=0.1
Filter rate @ 0.9 recall: threshold=0.14, filter_rate=0.924, recall=0.9
Filter rate @ 0.75 recall: threshold=0.398, filter_rate=0.966, recall=0.75

Looks good
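
For anyone re-reading the output above: accuracy, precision, and recall follow directly from the confusion table, reading rows as actual labels and the `~` columns as predictions. A minimal sketch of that arithmetic (the `confusion_metrics` helper is hypothetical, not part of revscoring):

```python
def confusion_metrics(tn, fp, fn, tp):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts taken from the table above:
# actual False -> 3974 predicted False, 30 predicted True
# actual True  ->   39 predicted False, 101 predicted True
metrics = confusion_metrics(tn=3974, fp=30, fn=39, tp=101)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.983
# precision: 0.771
# recall: 0.721
```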