Add new data for damaging models of Persian Wikipedia
Closed, Resolved (Public)

Reverted:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: init=null, max_leaf_nodes=null, min_samples_split=2, presort="auto", warm_start=false, min_weight_fraction_leaf=0.0, max_features="log2", max_depth=7, learning_rate=0.01, verbose=0, subsample=1.0, loss="deviance", scale=true, n_estimators=500, balanced_sample=false, criterion="friedman_mse", min_samples_leaf=1, center=true, balanced_sample_weight=true, random_state=null, min_impurity_split=1e-07
 - version: 0.3.0
 - trained: 2017-07-22T07:57:02.413279

Table:
	         ~False    ~True
	-----  --------  -------
	False     34858     3228
	True        232      934

Accuracy: 0.912
Precision:
	-----  -----
	False  0.993
	True   0.224
	-----  -----

Recall:
	-----  -----
	False  0.915
	True   0.8
	-----  -----
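
As a sanity check, the accuracy, precision and recall figures above can be recomputed from the confusion table. The sketch below assumes that rows are actual labels and columns are predicted labels (`~False`, `~True`); the helper names are made up for illustration and are not part of revscoring. Under that reading the results agree with the reported values to within 0.001 (recall for True comes out at 0.801 against the reported 0.8).

```python
# Confusion table for the "reverted" model, copied from the output above.
# Assumption: outer keys are actual labels, inner keys are predicted labels.
table = {
    False: {False: 34858, True: 3228},
    True: {False: 232, True: 934},
}


def accuracy(table):
    """Share of all observations whose predicted label equals the actual one."""
    total = sum(sum(row.values()) for row in table.values())
    correct = sum(table[label][label] for label in table)
    return correct / total


def precision(table, label):
    """Of everything predicted as `label`, the share that really is `label`."""
    predicted = sum(table[actual][label] for actual in table)
    return table[label][label] / predicted


def recall(table, label):
    """Of everything that really is `label`, the share predicted as `label`."""
    return table[label][label] / sum(table[label].values())


print("accuracy ", round(accuracy(table), 3))
for label in (False, True):
    print(label, "precision", round(precision(table, label), 3),
          "recall", round(recall(table, label), 3))
```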

PR-AUC:
	-----  -----
	False  0.994
	True   0.343
	-----  -----

ROC-AUC:
	-----  -----
	False  0.936
	True   0.939
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.734     0.823  0.097
	True           0.445     0.83   0.096

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.572          0.122     0.9
	True           0.267          0.802     0.903

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.828          0.27      0.75
	True           0.597          0.911     0.754

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.605     0.88         0.995
	True           0.952     0.034        1

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.33      0.943         0.99
	True           0.952     0.034         1

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.106     0.983         0.98
	True           0.952     0.034         1

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1             0.97
	True           0.952     0.036         0.98

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.948     0.047        0.849

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.943     0.073        0.672

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.92      0.225        0.459

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.301     0.886        0.156

Damaging:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: n_estimators=300, warm_start=false, random_state=null, min_samples_leaf=1, balanced_sample=false, criterion="friedman_mse", max_features="log2", init=null, max_leaf_nodes=null, center=true, presort="auto", subsample=1.0, scale=true, min_samples_split=2, verbose=0, balanced_sample_weight=true, max_depth=3, min_weight_fraction_leaf=0.0, min_impurity_split=1e-07, loss="deviance", learning_rate=0.1
 - version: 0.3.0
 - trained: 2017-07-22T08:13:52.189658

Table:
	         ~False    ~True
	-----  --------  -------
	False     34778     3317
	True         75     1082

Accuracy: 0.914
Precision:
	-----  -----
	False  0.998
	True   0.246
	-----  -----

Recall:
	-----  -----
	False  0.913
	True   0.935
	-----  -----

PR-AUC:
	-----  -----
	False  0.994
	True   0.403
	-----  -----

ROC-AUC:
	-----  -----
	False  0.964
	True   0.973
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.396     0.923  0.093
	True           0.34      0.969  0.097

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.693          0.125     0.9
	True           0.607          0.898     0.905

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.988          0.27      0.752
	True           0.818          0.929     0.753
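
For reading the filter-rate rows, here is a rough sketch that assumes filter rate means the share of all observations that are not flagged as the target label, so they would not need review. That definition and the function name are assumptions, not taken from the output itself. Applied to the damaging confusion table above (rows read as actual labels, columns as predicted labels), it gives about 0.888 for True at a recall of 0.935, which is in line with the reported 0.898 at 0.905 recall: slightly higher recall, slightly lower filter rate.

```python
# Confusion table for the "damaging" model, copied from the output above.
# Assumption: outer keys are actual labels, inner keys are predicted labels.
table = {
    False: {False: 34778, True: 3317},
    True: {False: 75, True: 1082},
}


def filter_rate(table, label):
    """Assumed definition: share of observations NOT predicted as `label`."""
    total = sum(sum(row.values()) for row in table.values())
    flagged = sum(table[actual][label] for actual in table)
    return 1 - flagged / total


def recall(table, label):
    """Of everything that really is `label`, the share predicted as `label`."""
    return table[label][label] / sum(table[label].values())


for label in (False, True):
    print(label, "filter_rate", round(filter_rate(table, label), 3),
          "recall", round(recall(table, label), 3))
```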

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.267     0.936        0.995
	True           0.987     0.019        1

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.142     0.958         0.99
	True           0.987     0.019         1

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.054     0.987         0.98
	True           0.987     0.019         1

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.987     0.02         0.983

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.98      0.075        0.864

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.975     0.107        0.672

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.949     0.311        0.479

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.059     0.996        0.172

Goodfaith:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: loss="deviance", init=null, presort="auto", min_weight_fraction_leaf=0.0, min_impurity_split=1e-07, max_depth=7, min_samples_leaf=1, max_features="log2", criterion="friedman_mse", warm_start=false, random_state=null, verbose=0, balanced_sample_weight=true, center=true, n_estimators=500, balanced_sample=false, subsample=1.0, min_samples_split=2, learning_rate=0.01, scale=true, max_leaf_nodes=null
 - version: 0.3.0
 - trained: 2017-07-22T09:52:28.507715

Table:
	         ~False    ~True
	-----  --------  -------
	False       582       88
	True       3126    35456

Accuracy: 0.918
Precision:
	-----  -----
	False  0.157
	True   0.998
	-----  -----

Recall:
	-----  -----
	False  0.869
	True   0.919
	-----  -----

PR-AUC:
	-----  -----
	False  0.242
	True   0.995
	-----  -----

ROC-AUC:
	-----  -----
	False  0.967
	True   0.957
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.313     0.943  0.098
	True           0.557     0.914  0.093

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.446          0.9       0.907
	True           0.712          0.114     0.9

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.68           0.928     0.754
	True           0.967          0.263     0.75

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.3       0.943        0.995

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026         1
	True           0.138     0.972         0.99

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.042     1            0.983

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.042     1            0.983

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.95      0.044        0.906
	True           0.042     1            0.983

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.946     0.055        0.786
	True           0.042     1            0.983

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.94      0.085        0.56
	True           0.042     1            0.983

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.423     0.903        0.153
	True           0.042     1            0.983