Contact: @jberkel
Description
Event Timeline
Comment Actions
OK. I've got a 40K sample and it has only 164 reverted cases. The Wikidata case again ;)
We can either run a 200K sample or go through the dumps. @Halfak, ideas?
Comment Actions
Wait, 40K changes and only 164 reversions? That sounds too low. Maybe lots of bot changes in there?
Comment Actions
(p3)ladsgroup@ores-compute-01:~/editquality$ grep "True" datasets/enwiktionary.rev_reverted.200k_2016.tsv | wc -l
815
That's acceptable, but I think we should sample from the False cases, since running feature extraction and training on the full 200K is not very wise. I think a 20K sample would be enough. Let me do that and see what happens.
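The sampling idea above can be sketched roughly as follows: keep every reverted ("True") row and draw a random subset of the non-reverted rows until the total reaches the target size. This is only an illustration, not the editquality tooling; the helper name `downsample_negatives`, the assumption that the label is the last column, and the fixed seed are all hypothetical.

```python
import random


def downsample_negatives(rows, target_size, seed=0):
    """Keep all positive rows and a random sample of negative rows.

    `rows` is a list of tuples whose last element is the label string
    ("True" or "False"); this layout is an assumption for illustration.
    """
    positives = [r for r in rows if r[-1] == "True"]
    negatives = [r for r in rows if r[-1] != "True"]

    # Fill the remaining budget with negatives, never asking for more than exist.
    n_neg = max(0, min(len(negatives), target_size - len(positives)))

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = positives + rng.sample(negatives, n_neg)
    rng.shuffle(sampled)
    return sampled


if __name__ == "__main__":
    # Toy data: 5 reverted edits among 100 revisions.
    rows = [(i, "True") for i in range(5)] + [(i, "False") for i in range(5, 100)]
    subset = downsample_negatives(rows, target_size=20)
    print(len(subset), sum(1 for r in subset if r[-1] == "True"))
```

Note that sampling only from the False cases raises the share of reverted edits in the training set, so scores from a model trained this way reflect the sampled class balance rather than the wiki's real revert rate.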
Comment Actions
2016-06-30 13:09:23,221 INFO:revscoring.utilities.train_test -- Training model...
2016-06-30 13:09:36,251 INFO:revscoring.utilities.train_test -- Testing model...
ScikitLearnClassifier
 - type: RF
 - params: max_depth=null, balanced_sample_weight=true, warm_start=false, class_weight=null, max_leaf_nodes=null, scale=true, max_features="log2", min_weight_fraction_leaf=0.0, min_samples_split=2, bootstrap=true, balanced_sample=false, oob_score=false, min_samples_leaf=3, criterion="entropy", random_state=null, verbose=0, center=true, n_estimators=320, n_jobs=1
 - version: 0.0.1
 - trained: 2016-06-30T13:09:36.246755

Table:
         ~False    ~True
-----  --------  -------
False      3974       30
True         39      101

Accuracy: 0.983
Precision: 0.771
Recall: 0.721
PR-AUC: 0.757
ROC-AUC: 0.985

Recall @ 0.1 false-positive rate: threshold=0.943, recall=0.129, fpr=0.1
Filter rate @ 0.9 recall: threshold=0.14, filter_rate=0.924, recall=0.9
Filter rate @ 0.75 recall: threshold=0.398, filter_rate=0.966, recall=0.75
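As a sanity check, the accuracy, precision, and recall figures can be recomputed from the confusion table, reading rows as actual labels and columns (~False, ~True) as predictions. The function name `confusion_metrics` is made up for this sketch and is not part of revscoring.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Compute basic metrics from the four cells of a binary confusion table."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,  # share of all edits classified correctly
        "precision": tp / (tp + fp),    # share of predicted reverts that were real
        "recall": tp / (tp + fn),       # share of real reverts that were caught
    }


# Cells taken from the table in the log output.
metrics = confusion_metrics(tn=3974, fp=30, fn=39, tp=101)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The high accuracy is mostly driven by the 4004 non-reverted edits in the test set; precision and recall say more about how the model handles the 140 reverted ones.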
Looks good
Comment Actions
Deployed now! See https://ores.wmflabs.org/v2/scores/enwiktionary/reverted/2345678 for an example scoring of https://en.wiktionary.org/wiki/?diff=2345678