Add new data for damaging models of Persian Wikipedia
Closed, Resolved · Public

Assigned To
	Ladsgroup

Authored By
	AnotherLadsgroup
	Jul 18 2017, 5:43 PM

Related Objects

Status	Assigned	Task
Resolved	Halfak	T171505 Late-July 2017 ORES deploy
Resolved	Ladsgroup	T170960 Add new data for damaging models of Persian Wikipedia
Resolved	Ladsgroup	T171386 Investigate small loss in fitness with the new data in fawiki

Event Timeline

AnotherLadsgroup created this task. · Jul 18 2017, 5:43 PM

Restricted Application added a project: artificial-intelligence. · Jul 18 2017, 5:43 PM

Restricted Application added a subscriber: Aklapper.

Ladsgroup claimed this task. · Jul 19 2017, 7:07 AM

Restricted Application added a project: User-Ladsgroup. · Jul 19 2017, 7:07 AM

Ladsgroup moved this task from Incoming to In progress on the User-Ladsgroup board. · Jul 19 2017, 6:36 PM

Reverted:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: init=null, max_leaf_nodes=null, min_samples_split=2, presort="auto", warm_start=false, min_weight_fraction_leaf=0.0, max_features="log2", max_depth=7, learning_rate=0.01, verbose=0, subsample=1.0, loss="deviance", scale=true, n_estimators=500, balanced_sample=false, criterion="friedman_mse", min_samples_leaf=1, center=true, balanced_sample_weight=true, random_state=null, min_impurity_split=1e-07
 - version: 0.3.0
 - trained: 2017-07-22T07:57:02.413279

Table:
	         ~False    ~True
	-----  --------  -------
	False     34858     3228
	True        232      934

Accuracy: 0.912
Precision:
	-----  -----
	False  0.993
	True   0.224
	-----  -----

Recall:
	-----  -----
	False  0.915
	True   0.8
	-----  -----
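
As a sanity check, the accuracy, precision and recall figures above can be recomputed from the confusion table. This is a minimal sketch, assuming the table rows are actual labels and the columns (~False, ~True) are predicted labels; the `summarize` helper is hypothetical and not part of revscoring. The results agree with the reported values to within about 0.001.

```python
def summarize(tn, fp, fn, tp):
    """Accuracy plus per-label precision and recall for a 2x2 table.

    Assumes rows are actual labels and columns are predicted labels,
    with "True" treated as the positive class.
    """
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": {"False": tn / (tn + fn), "True": tp / (tp + fp)},
        "recall": {"False": tn / (tn + fp), "True": tp / (tp + fn)},
    }


# Counts copied from the Reverted table above.
reverted = summarize(tn=34858, fp=3228, fn=232, tp=934)

for name in ("precision", "recall"):
    for label, value in reverted[name].items():
        print(f"{name:9s} {label:5s} {value:.3f}")
print(f"accuracy        {reverted['accuracy']:.3f}")
```

The same helper can be applied to the Damaging and Goodfaith tables by swapping in their counts.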

PR-AUC:
	-----  -----
	False  0.994
	True   0.343
	-----  -----

ROC-AUC:
	-----  -----
	False  0.936
	True   0.939
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.734     0.823  0.097
	True           0.445     0.83   0.096

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.572          0.122     0.9
	True           0.267          0.802     0.903

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.828          0.27      0.75
	True           0.597          0.911     0.754

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.605     0.88         0.995
	True           0.952     0.034        1

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.33      0.943         0.99
	True           0.952     0.034         1

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.106     0.983         0.98
	True           0.952     0.034         1

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1             0.97
	True           0.952     0.036         0.98

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.948     0.047        0.849

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.943     0.073        0.672

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.92      0.225        0.459

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.046     1            0.97
	True           0.301     0.886        0.156

Damaging:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: n_estimators=300, warm_start=false, random_state=null, min_samples_leaf=1, balanced_sample=false, criterion="friedman_mse", max_features="log2", init=null, max_leaf_nodes=null, center=true, presort="auto", subsample=1.0, scale=true, min_samples_split=2, verbose=0, balanced_sample_weight=true, max_depth=3, min_weight_fraction_leaf=0.0, min_impurity_split=1e-07, loss="deviance", learning_rate=0.1
 - version: 0.3.0
 - trained: 2017-07-22T08:13:52.189658

Table:
	         ~False    ~True
	-----  --------  -------
	False     34778     3317
	True         75     1082

Accuracy: 0.914
Precision:
	-----  -----
	False  0.998
	True   0.246
	-----  -----

Recall:
	-----  -----
	False  0.913
	True   0.935
	-----  -----

PR-AUC:
	-----  -----
	False  0.994
	True   0.403
	-----  -----

ROC-AUC:
	-----  -----
	False  0.964
	True   0.973
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.396     0.923  0.093
	True           0.34      0.969  0.097

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.693          0.125     0.9
	True           0.607          0.898     0.905
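
For comparison with the threshold rows above, the false-positive rate and filter rate at the operating point behind the Damaging confusion table can be derived from its counts. This is a rough sketch under one assumption that is not stated in this task: that filter_rate means the share of edits the model does not flag (one minus the fraction predicted True). The `operating_point` helper is hypothetical.

```python
def operating_point(tn, fp, fn, tp):
    """Recall, false-positive rate and filter rate for the "True" label.

    filter_rate is taken here to be the fraction of all observations
    that are NOT predicted True (an assumption about the column's meaning).
    """
    total = tn + fp + fn + tp
    return {
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "filter_rate": 1 - (tp + fp) / total,
    }


# Counts copied from the Damaging table above.
damaging = operating_point(tn=34778, fp=3317, fn=75, tp=1082)

for name, value in damaging.items():
    print(f"{name:12s} {value:.3f}")
```

Under that reading, a reviewer working from the model's flags at this operating point would look at roughly 11% of edits while still seeing about 93.5% of the damaging ones.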

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.988          0.27      0.752
	True           0.818          0.929     0.753

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.267     0.936        0.995
	True           0.987     0.019        1

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.142     0.958         0.99
	True           0.987     0.019         1

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.054     0.987         0.98
	True           0.987     0.019         1

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011      1           0.971
	True           0.987      0.02        0.983

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.98      0.075        0.864

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.975     0.107        0.672

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.949     0.311        0.479

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.011     1            0.971
	True           0.059     0.996        0.172

Goodfaith:

ScikitLearnClassifier
 - type: GradientBoosting
 - params: loss="deviance", init=null, presort="auto", min_weight_fraction_leaf=0.0, min_impurity_split=1e-07, max_depth=7, min_samples_leaf=1, max_features="log2", criterion="friedman_mse", warm_start=false, random_state=null, verbose=0, balanced_sample_weight=true, center=true, n_estimators=500, balanced_sample=false, subsample=1.0, min_samples_split=2, learning_rate=0.01, scale=true, max_leaf_nodes=null
 - version: 0.3.0
 - trained: 2017-07-22T09:52:28.507715

Table:
	         ~False    ~True
	-----  --------  -------
	False       582       88
	True       3126    35456

Accuracy: 0.918
Precision:
	-----  -----
	False  0.157
	True   0.998
	-----  -----

Recall:
	-----  -----
	False  0.869
	True   0.919
	-----  -----
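
The low precision for the False label despite recall near 0.87 follows from class imbalance: bad-faith edits are under 2% of this sample. The sketch below illustrates the relationship using Bayes' rule, with counts taken from the Goodfaith table and "False" treated as the target label. The `precision_from_rates` helper and the balanced-sample comparison are illustrative assumptions, not results reported in this task.

```python
def precision_from_rates(recall, fpr, prevalence):
    """Precision implied by recall, false-positive rate and prevalence."""
    true_pos = recall * prevalence
    false_pos = fpr * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Goodfaith table, with "False" (bad faith) as the target label.
tp, fn = 582, 88        # actual False: predicted False, predicted True
fp, tn = 3126, 35456    # actual True:  predicted False, predicted True

total = tp + fn + fp + tn
prevalence = (tp + fn) / total
recall = tp / (tp + fn)
fpr = fp / (fp + tn)

observed = precision_from_rates(recall, fpr, prevalence)
# Hypothetical: same recall and fpr on a 50/50 sample.
balanced = precision_from_rates(recall, fpr, 0.5)

print(f"prevalence          {prevalence:.3f}")
print(f"precision (sample)  {observed:.3f}")
print(f"precision (50/50)   {balanced:.3f}")
```

The first precision figure matches the 0.157 reported above; the second only shows how strongly the number depends on the base rate.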

PR-AUC:
	-----  -----
	False  0.242
	True   0.995
	-----  -----

ROC-AUC:
	-----  -----
	False  0.967
	True   0.957
	-----  -----

Recall @ 0.1 false-positive rate:
	label      threshold    recall    fpr
	-------  -----------  --------  -----
	False          0.313     0.943  0.098
	True           0.557     0.914  0.093

Filter rate @ 0.9 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.446          0.9       0.907
	True           0.712          0.114     0.9

Filter rate @ 0.75 recall:
	label      threshold    filter_rate    recall
	-------  -----------  -------------  --------
	False          0.68           0.928     0.754
	True           0.967          0.263     0.75

Recall @ 0.995 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.3       0.943        0.995

Recall @ 0.99 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026         1
	True           0.138     0.972         0.99

Recall @ 0.98 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.042     1            0.983

Recall @ 0.9 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.954     0.026        1
	True           0.042     1            0.983

Recall @ 0.75 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.95      0.044        0.906
	True           0.042     1            0.983

Recall @ 0.6 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.946     0.055        0.786
	True           0.042     1            0.983

Recall @ 0.45 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.94      0.085        0.56
	True           0.042     1            0.983

Recall @ 0.15 precision:
	label      threshold    recall    precision
	-------  -----------  --------  -----------
	False          0.423     0.903        0.153
	True           0.042     1            0.983