hoo@stat1007:~$ cat draftquality/model_info/enwiki.draft_quality.md
Model Information:
  - type: GradientBoosting
  - version: 0.2.0
  - params: {'max_depth': 5, 'min_impurity_decrease': 0.0, 'multilabel': False, 'min_weight_fraction_leaf': 0.0, 'scale': False, 'label_weights': None, 'random_state': None, 'subsample': 1.0, 'min_impurity_split': None, 'max_features': 'log2', 'n_estimators': 300, 'labels': ['OK', 'spam', 'vandalism', 'attack'], 'presort': 'auto', 'max_leaf_nodes': None, 'learning_rate': 0.1, 'center': False, 'verbose': 0, 'population_rates': None, 'init': None, 'min_samples_leaf': 1, 'criterion': 'friedman_mse', 'warm_start': False, 'loss': 'deviance', 'min_samples_split': 2}

  Environment:
    - revscoring_version: '2.3.0'
    - platform: 'Linux-4.9.0-8-amd64-x86_64-with-debian-9.5'
    - machine: 'x86_64'
    - version: '#1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08)'
    - system: 'Linux'
    - processor: ''
    - python_build: ('default', 'Sep 27 2018 17:25:39')
    - python_compiler: 'GCC 6.3.0 20170516'
    - python_branch: ''
    - python_implementation: 'CPython'
    - python_revision: ''
    - python_version: '3.5.3'
    - release: '4.9.0-8-amd64'

  Statistics:
  counts (n=201261):
    label             n          ~OK    ~spam    ~vandalism    ~attack
    -----------  ------  ---  ------  -------  ------------  ---------
    'OK'         175000  -->  171425     2656           865         54
    'spam'        17699  -->    2763    14037           865         34
    'vandalism'    6503  -->    1596     1351          3213        343
    'attack'       2059  -->     252      350          1117        340
  rates:
                  'OK'    'spam'    'vandalism'    'attack'
    ----------  ------  --------  -------------  ----------
    sample        0.87     0.088          0.032        0.01
    population   0.971      0.02          0.007       0.002
  match_rate (micro=0.93, macro=0.254):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.956        0.018   0.039     0.003
  filter_rate (micro=0.07, macro=0.746):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.044        0.982   0.961     0.997
  recall (micro=0.971, macro=0.608):
      OK    vandalism    spam    attack
    ----  -----------  ------  --------
    0.98        0.494   0.793     0.165
  !recall (micro=0.829, macro=0.946):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.824        0.985   0.976     0.998
  precision (micro=0.975, macro=0.434):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.995        0.196   0.399     0.148
  !precision (micro=0.559, macro=0.884):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.546        0.996   0.996     0.998
  f1 (micro=0.971, macro=0.489):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.987        0.281   0.531     0.156
  !f1 (micro=0.667, macro=0.908):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.657        0.991   0.986     0.998
  accuracy (micro=0.975, macro=0.981):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.975        0.982   0.973     0.996
  fpr (micro=0.171, macro=0.054):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.176        0.015   0.024     0.002
  roc_auc (micro=0.979, macro=0.971):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.979        0.956   0.979     0.968
  pr_auc (micro=0.984, macro=0.479):
       OK    vandalism    spam    attack
    -----  -----------  ------  --------
    0.999        0.207   0.612     0.097

  - score_schema: {'title': 'Scikit learn-based classifier score with probability', 'type': 'object', 'properties': {'probability': {'type': 'object', 'description': 'A mapping of probabilities onto each of the potential output labels', 'properties': {'OK': {'type': 'number'}, 'vandalism': {'type': 'number'}, 'spam': {'type': 'number'}, 'attack': {'type': 'number'}}}, 'prediction': {'type': 'string', 'description': 'The most likely label predicted by the estimator'}}}
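
The recall figures in the report can be cross-checked against the counts table. The sketch below is not part of revscoring. It copies the confusion counts and population rates from the output above and recomputes per-label recall as the diagonal count divided by the row total. The names `COUNTS`, `POPULATION_RATES`, and `per_label_recall` are hypothetical. Treating the reported micro recall as a population-rate-weighted average is an assumption that happens to reproduce the reported 0.971; the unweighted mean reproduces the reported macro value of 0.608. Note that the counts table lists labels as OK, spam, vandalism, attack, while the metric tables list them as OK, vandalism, spam, attack.

```python
# Confusion counts copied from the "counts" table: each row is a true label,
# each column a predicted label, both in the order OK, spam, vandalism, attack.
LABELS = ["OK", "spam", "vandalism", "attack"]
COUNTS = {
    "OK":        [171425, 2656, 865, 54],
    "spam":      [2763, 14037, 865, 34],
    "vandalism": [1596, 1351, 3213, 343],
    "attack":    [252, 350, 1117, 340],
}
# "population" row of the "rates" table.
POPULATION_RATES = {"OK": 0.971, "spam": 0.02, "vandalism": 0.007, "attack": 0.002}


def per_label_recall(counts, labels):
    """Recall per label: correctly predicted observations / all observations of that label."""
    recall = {}
    for i, label in enumerate(labels):
        row = counts[label]
        recall[label] = row[i] / sum(row)
    return recall


recall = per_label_recall(COUNTS, LABELS)

# Unweighted mean over labels (matches the reported macro value).
macro_recall = sum(recall.values()) / len(recall)

# Assumption: weight each label by its population rate to approximate the
# reported micro value.
weighted_recall = sum(POPULATION_RATES[label] * recall[label] for label in LABELS)

for label in LABELS:
    print(f"{label:<10} {recall[label]:.3f}")
print(f"{'macro':<10} {macro_recall:.3f}")
print(f"{'weighted':<10} {weighted_recall:.3f}")
```

Recall depends only on the rows of the confusion matrix, so it can be recomputed from the sample counts directly. Precision cannot be checked the same way: the reported precision values appear to be scaled to the population rates rather than the sample rates, so a naive column-wise calculation on these counts will not match them.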