Train and test damaging/goodfaith models for Czech Wikipedia
Closed, Resolved (Public)

# Model tuning report
- Revscoring version: 1.3.5
- Features: editquality.feature_lists.cswiki.damaging
- Date: 2017-01-28T17:42:35.310686
- Observations: 4925
- Labels: [false, true]
- Scoring: roc_auc
- Folds: 5

# Top scoring configurations
| model                      |   mean(scores) |   std(scores) | params                                                                         |
|:---------------------------|---------------:|--------------:|:-------------------------------------------------------------------------------|
| RandomForestClassifier     |          0.834 |         0.016 | max_features="log2", criterion="entropy", n_estimators=640, min_samples_leaf=3 |
| RandomForestClassifier     |          0.834 |         0.017 | max_features="log2", criterion="entropy", n_estimators=160, min_samples_leaf=5 |
| RandomForestClassifier     |          0.833 |         0.016 | max_features="log2", criterion="entropy", n_estimators=640, min_samples_leaf=5 |
| GradientBoostingClassifier |          0.831 |         0.021 | max_features="log2", learning_rate=0.01, n_estimators=500, max_depth=7         |
| RandomForestClassifier     |          0.831 |         0.017 | max_features="log2", criterion="entropy", n_estimators=320, min_samples_leaf=3 |
| GradientBoostingClassifier |          0.831 |         0.021 | max_features="log2", learning_rate=0.01, n_estimators=700, max_depth=7         |
| RandomForestClassifier     |          0.83  |         0.02  | max_features="log2", criterion="entropy", n_estimators=640, min_samples_leaf=1 |
| RandomForestClassifier     |          0.83  |         0.012 | max_features="log2", criterion="entropy", n_estimators=160, min_samples_leaf=3 |
| RandomForestClassifier     |          0.83  |         0.015 | max_features="log2", criterion="entropy", n_estimators=640, min_samples_leaf=7 |
| GradientBoostingClassifier |          0.83  |         0.019 | max_features="log2", learning_rate=0.01, n_estimators=500, max_depth=5         |
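
For anyone reading the table cold: mean(scores) and std(scores) summarize the ROC-AUC over the 5 folds. Below is a minimal sketch of that summary in plain Python. The `roc_auc` and `summarize_folds` helpers are hypothetical stand-ins, not revscoring's code, the toy data and fold scores are made up for illustration, and using the population standard deviation is an assumption (the report does not say which variant it uses).

```python
from statistics import mean, pstdev


def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l]
    negatives = [s for l, s in zip(labels, scores) if not l]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def summarize_folds(fold_scores):
    """Return (mean, population std) of per-fold scores, rounded to 3 decimals."""
    return round(mean(fold_scores), 3), round(pstdev(fold_scores), 3)


# Hypothetical toy data: true label and predicted probability of True.
toy_labels = [True, False, True, False, False]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

# Hypothetical per-fold ROC-AUC values for a single configuration.
example_fold_scores = [0.80, 0.84, 0.82, 0.79, 0.85]
example_summary = summarize_folds(example_fold_scores)
```

The pairwise loop is quadratic, which is fine for a sketch but too slow for real evaluation sets; a rank-based version gives the same number.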

Model for damaging:

 - type: GradientBoosting
 - params: balanced_sample_weight=true, init=null, min_samples_split=2, center=true, subsample=1.0, min_weight_fraction_leaf=0.0, balanced_sample=false, n_estimators=500, random_state=null, presort="auto", warm_start=false, scale=true, verbose=0, max_depth=7, learning_rate=0.01, min_samples_leaf=1, max_features="log2", loss="deviance", max_leaf_nodes=null
 - version: 0.3.0
 - trained: 2017-01-28T18:08:59.959144
Table:
	         ~False    ~True
	-----  --------  -------
	False      4032      444
	True        208      241

Accuracy: 0.868
Precision:
	-----  -----
	False  0.951
	True   0.348
	-----  -----

Recall:
	-----  -----
	False  0.901
	True   0.534
	-----  -----

PR-AUC:
	-----  -----
	False  0.976
	True   0.423
	-----  -----

ROC-AUC:
	-----  -----
	False  0.835
	True   0.835
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.856     0.534  0.088
	True           0.508     0.547  0.091

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.837     0.56         0.982
	True           0.885     0.059        1

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.118     0.997        0.914
	True           0.881     0.067        0.991

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.095     1            0.91
	True           0.64      0.409        0.474

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.095     1            0.91
	True           0.144     0.903        0.161
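
As a sanity check on the figures above, accuracy, precision and recall can be recomputed from the confusion table. This is a minimal sketch; `metrics_from_confusion` is a hypothetical helper, and reading the rows as actual labels and the `~` columns as predictions is an assumption. Under that reading the accuracy and the False-class figures match the report, while the True-class precision and recall come out close to, but not exactly at, the reported 0.348 and 0.534. The report does not say how it aggregates those, so the sketch does not try to explain the gap.

```python
def metrics_from_confusion(matrix, labels=(False, True)):
    """Derive accuracy plus per-label precision and recall.

    matrix[actual][predicted] holds counts: rows are actual labels and
    columns are predicted labels (an assumed reading of the "Table:" block).
    """
    total = sum(matrix[a][p] for a in labels for p in labels)
    correct = sum(matrix[l][l] for l in labels)
    precision, recall = {}, {}
    for l in labels:
        predicted_as_l = sum(matrix[a][l] for a in labels)
        actually_l = sum(matrix[l][p] for p in labels)
        precision[l] = matrix[l][l] / predicted_as_l if predicted_as_l else 0.0
        recall[l] = matrix[l][l] / actually_l if actually_l else 0.0
    return {"accuracy": correct / total, "precision": precision, "recall": recall}


# Counts copied from the damaging table above.
damaging = {
    False: {False: 4032, True: 444},
    True: {False: 208, True: 241},
}
damaging_metrics = metrics_from_confusion(damaging)
```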

Model for goodfaith:

 - type: GradientBoosting
 - params: presort="auto", min_samples_leaf=1, max_leaf_nodes=null, min_samples_split=2, center=true, max_depth=5, balanced_sample_weight=true, min_weight_fraction_leaf=0.0, verbose=0, balanced_sample=false, random_state=null, max_features="log2", subsample=1.0, n_estimators=500, scale=true, warm_start=false, loss="deviance", learning_rate=0.01, init=null
 - version: 0.3.0
 - trained: 2017-01-28T18:10:19.008889
Table:
	         ~False    ~True
	-----  --------  -------
	False       159       59
	True        522     4185

Accuracy: 0.882
Precision:
	-----  -----
	False  0.23
	True   0.986
	-----  -----

Recall:
	-----  -----
	False  0.722
	True   0.889
	-----  -----

PR-AUC:
	-----  -----
	False  0.459
	True   0.991
	-----  -----

ROC-AUC:
	-----  -----
	False  0.888
	True   0.888
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.568     0.708  0.087
	True           0.847     0.576  0.075

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.914     0.111        1
	True           0.337     0.951        0.982

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.914     0.111        1
	True           0.079     1            0.957

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.773     0.465        0.518
	True           0.079     1            0.957

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.267     0.836        0.17
	True           0.079     1            0.957
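
The "Recall @ N precision" rows report, per label, a score threshold and the recall and precision reached at it. A rough sketch of one way such a row could be derived: scan candidate thresholds and keep the one with the highest recall whose precision still meets the target. `recall_at_precision` is a hypothetical helper and the toy data is made up; revscoring's actual procedure may differ.

```python
def recall_at_precision(labels, scores, min_precision):
    """Pick the threshold with the highest recall whose precision >= min_precision.

    An example counts as predicted positive when its score >= threshold.
    Returns (threshold, recall, precision), or None if no threshold qualifies.
    """
    total_positive = sum(1 for l in labels if l)
    if total_positive == 0:
        raise ValueError("no positive examples")
    best = None
    for threshold in sorted(set(scores), reverse=True):
        predicted = [l for l, s in zip(labels, scores) if s >= threshold]
        true_positive = sum(1 for l in predicted if l)
        precision = true_positive / len(predicted)
        recall = true_positive / total_positive
        if precision >= min_precision and (best is None or recall > best[1]):
            best = (threshold, recall, precision)
    return best


# Hypothetical toy data: true label and predicted probability for that label.
toy_labels = [True, True, False, True, False, False]
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.4, 0.2]

strict = recall_at_precision(toy_labels, toy_scores, 0.9)
relaxed = recall_at_precision(toy_labels, toy_scores, 0.7)
```

On the toy data, relaxing the precision target lowers the chosen threshold and raises recall, the same trade-off visible when moving from the 0.98 rows to the 0.15 rows above.
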

ScikitLearnClassifier
 - type: GradientBoosting
 - params: warm_start=false, min_samples_split=2, loss="deviance", init=null, min_weight_fraction_leaf=0.0, min_samples_leaf=1, max_depth=7, presort="auto", max_leaf_nodes=null, verbose=0, balanced_sample_weight=true, random_state=null, balanced_sample=false, max_features="log2", n_estimators=500, scale=true, center=true, learning_rate=0.01, subsample=1.0
 - version: 0.3.0
 - trained: 2017-01-28T20:18:51.260027

Table:
	         ~False    ~True
	-----  --------  -------
	False     17812     1193
	True         96      741

Accuracy: 0.935
Precision:
	-----  -----
	False  0.995
	True   0.383
	-----  -----

Recall:
	-----  -----
	False  0.937
	True   0.884
	-----  -----

PR-AUC:
	-----  -----
	False  0.995
	True   0.802
	-----  -----

ROC-AUC:
	-----  -----
	False  0.969
	True   0.965
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.572     0.916  0.093
	True           0.369     0.918  0.094

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.122     0.997        0.981
	True           0.919     0.345        0.996

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.051     1            0.963
	True           0.88      0.562        0.912

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.051     1            0.963
	True           0.61      0.863        0.474

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.051     1            0.963
	True           0.16      0.955        0.175

ScikitLearnClassifier
 - type: GradientBoosting
 - params: center=true, scale=true, min_weight_fraction_leaf=0.0, n_estimators=500, presort="auto", max_leaf_nodes=null, warm_start=false, verbose=0, max_features="log2", balanced_sample=false, init=null, loss="deviance", balanced_sample_weight=true, learning_rate=0.01, min_samples_leaf=1, min_samples_split=2, subsample=1.0, random_state=null, max_depth=5
 - version: 0.3.0
 - trained: 2017-01-28T22:26:31.693530

Table:
	         ~False    ~True
	-----  --------  -------
	False       368       31
	True       1079    18364

Accuracy: 0.944
Precision:
	-----  -----
	False  0.254
	True   0.998
	-----  -----

Recall:
	-----  -----
	False  0.922
	True   0.945
	-----  -----

PR-AUC:
	-----  -----
	False  0.692
	True   0.995
	-----  -----

ROC-AUC:
	-----  -----
	False  0.969
	True   0.971
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.31      0.951  0.081
	True           0.368     0.954  0.088

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.952     0.21         1
	True           0.046     1            0.983

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.94      0.334        0.936
	True           0.046     1            0.983

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.839     0.779        0.476
	True           0.046     1            0.983

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.306     0.945        0.197
	True           0.046     1            0.983

@Halfak If this is done (according to the column), I think we should close it as resolved, shouldn't we?

Subscribing is enough for me right now :).

Halfak renamed this task from Train and test editquality models for Czech Wikipedia to Train and test damaging/goodfaith models for Czech Wikipedia. (Jan 30 2017, 5:25 PM)

This is now deployed on WMFLabs. The next step is deploying to production and then enabling the ORES Review Tool. Stay tuned.

https://ores.wmflabs.org/v2/scores/cswiki/?models=damaging|goodfaith&model_info=trained|type
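
For scripted checks, the link above can be rebuilt and fetched with the Python standard library. This is a sketch only: `build_model_info_url` and `fetch_model_info` are hypothetical helper names, the code assumes the endpoint is still reachable and returns JSON, and it deliberately makes no assumption about the layout of the response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base path taken from the link above.
ORES_BASE = "https://ores.wmflabs.org/v2/scores"


def build_model_info_url(wiki, models, info_fields):
    """Build a model-info URL in the same shape as the link above."""
    query = urlencode(
        {"models": "|".join(models), "model_info": "|".join(info_fields)},
        safe="|",  # keep the pipe separators unescaped, as in the original link
    )
    return f"{ORES_BASE}/{wiki}/?{query}"


def fetch_model_info(url, timeout=10):
    """Fetch the URL and decode the body as JSON (response layout not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_model_info_url("cswiki", ["damaging", "goodfaith"], ["trained", "type"])
```

Calling `fetch_model_info(url)` performs the actual request; it is kept out of the module-level code so the snippet can be imported without network access.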