| Status | Assigned | Task |
|---|---|---|
| Resolved | johl | T127047 Collection of topics for HPI hackathon |
| Resolved | awight | T187836 [Epic] Audit of pending ORES GUI deployments |
| Resolved | Glorian_WD | T127470 Deploy item quality classification model for Wikidata |
| Resolved | Glorian_WD | T157498 Train/test item quality model for Wikidata |
| Resolved | Ladsgroup | T164862 Train a basic item quality based on edit quality for Wikidata |
Event Timeline
@Ladsgroup: are you going to train a single classifier, or train multiple classifiers and compare the results to find which one has the best accuracy?
@Glorian_WD, we use revscoring tune for estimator and hyperparameter optimization. So we'll likely test a set of benchmark models (naive Bayes, logistic regression, etc.) as well as a large set of parameters for Random Forest and Gradient Boosting.
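As a rough illustration of what estimator and hyperparameter optimization means here, the sketch below scores every combination of candidate estimator and parameter values by cross-validation and ranks them. It is not the revscoring tune implementation: the two toy classifiers (a majority-class baseline and a k-nearest-neighbours model standing in for the real benchmark models), the helper names `cv_accuracy` and `tune`, and the data are all hypothetical and use only the Python standard library.

```python
import itertools
from collections import Counter


class MajorityBaseline:
    """Benchmark: always predict the most common training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestNeighbors:
    """Toy k-nearest-neighbours classifier over one numeric feature."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, X, y):
        self.points = list(zip(X, y))
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            nearest = sorted(self.points, key=lambda p: abs(p[0] - x))[:self.k]
            votes = Counter(label for _, label in nearest)
            predictions.append(votes.most_common(1)[0][0])
        return predictions


def cv_accuracy(make_model, X, y, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for fold in range(folds):
        test = sorted(range(fold, len(X), folds))
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        predicted = model.predict([X[i] for i in test])
        hits = sum(p == y[i] for p, i in zip(predicted, test))
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


def tune(candidates, X, y, folds=5):
    """Score every (estimator, parameter combination); best first."""
    results = []
    for name, cls, grid in candidates:
        keys = sorted(grid)
        for values in itertools.product(*(grid[k] for k in keys)):
            params = dict(zip(keys, values))
            score = cv_accuracy(lambda: cls(**params), X, y, folds)
            results.append((name, params, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical data: one integer feature, three quality classes.
X = list(range(60))
y = ["C" if x < 20 else "B" if x < 40 else "A" for x in X]

candidates = [
    ("MajorityBaseline", MajorityBaseline, {}),
    ("NearestNeighbors", NearestNeighbors, {"k": [1, 3, 5]}),
]
results = tune(candidates, X, y)
for name, params, score in results:
    print(f"{name} {params}: accuracy={score:.3f}")
```

The same loop structure applies when the candidates are real estimators with larger grids; only the classes and parameter dictionaries change.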
2017-05-20 17:01:35,365 INFO:revscoring.utilities.cv_train -- Cross-validating model statistics for 10 folds...
2017-05-20 17:01:35,428 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 1...
2017-05-20 17:01:35,451 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 2...
2017-05-20 17:01:35,462 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 3...
2017-05-20 17:01:35,483 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 5...
2017-05-20 17:01:35,508 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 4...
2017-05-20 17:01:35,517 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 6...
2017-05-20 17:01:35,506 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 7...
2017-05-20 17:01:35,528 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 8...
2017-05-20 17:01:42,878 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 9...
2017-05-20 17:01:42,925 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 10...
2017-05-20 17:01:48,206 INFO:revscoring.utilities.cv_train -- Training model on all data...
ScikitLearnClassifier
 - type: RF
 - params: verbose=0, scale=true, criterion="gini", balanced_sample_weight=false, min_samples_split=2, max_features="log2", n_jobs=1, min_weight_fraction_leaf=0.0, warm_start=false, center=true, balanced_sample=true, oob_score=false, class_weight=null, random_state=null, bootstrap=true, max_leaf_nodes=null, n_estimators=20, max_depth=null, min_samples_leaf=13
 - version: .0
 - trained: 2017-05-20T17:01:48.950545

Table:
          ~A    ~B    ~C    ~D    ~E
    --  ----  ----  ----  ----  ----
    A    279    33    10     0     0
    B     64   291    77     5     1
    C     63   208  1414    86     2
    D      0     1    65   894    37
    E      0     0     5   103  1361

Accuracy: 0.848

ROC-AUC:
    ---  -----
    'A'  0.987
    'B'  0.937
    'C'  0.969
    'D'  0.977
    'E'  0.993
    ---  -----

F1:
    -  -----
    E  0.948
    B  0.595
    D  0.858
    A  0.764
    C  0.845
    -  -----
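The accuracy and F1 figures in the output above can be cross-checked against the confusion table. The sketch below recomputes them with plain Python, using the standard definitions (accuracy as the diagonal over the total, per-class F1 as 2·TP divided by the row sum plus the column sum). It is not revscoring's own statistics code, and the variable names are illustrative. Because that F1 formula is symmetric in rows and columns, the result does not depend on which axis holds the actual labels.

```python
# Confusion table copied from the model output above.
labels = ["A", "B", "C", "D", "E"]
table = [
    [279, 33, 10, 0, 0],
    [64, 291, 77, 5, 1],
    [63, 208, 1414, 86, 2],
    [0, 1, 65, 894, 37],
    [0, 0, 5, 103, 1361],
]

# Accuracy: correctly classified items (the diagonal) over all items.
total = sum(sum(row) for row in table)
correct = sum(table[i][i] for i in range(len(labels)))
accuracy = correct / total

# Per-class F1: 2*TP / (row sum + column sum), which is equivalent to
# the harmonic mean of precision and recall.
f1 = {}
for i, label in enumerate(labels):
    row_sum = sum(table[i])
    col_sum = sum(row[i] for row in table)
    f1[label] = 2 * table[i][i] / (row_sum + col_sum)

print(f"accuracy = {accuracy:.3f}")
for label in labels:
    print(f"F1[{label}] = {f1[label]:.3f}")
```

The recomputed accuracy rounds to the reported 0.848. The pooled F1 values land within a few thousandths of the reported ones rather than matching exactly (for example, about 0.599 versus 0.595 for B); one plausible explanation, which is an assumption here, is that the reported statistics are averaged per fold rather than computed from the pooled table.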