Train and test wp10 model for fawiki
Closed, Resolved · Public

Event Timeline

@Ladsgroup will be working on a merge_labels utility for wp10/wikilabels.

Model Information:
	 - type: GradientBoosting
	 - version: 0.6.0
	 - params: {'presort': 'auto', 'population_rates': None, 'learning_rate': 0.01, 'min_weight_fraction_leaf': 0.0, 'loss': 'deviance', 'max_depth': 7, 'labels': ['Stub', 'Start', 'C', 'B', 'GA', 'FA'], 'verbose': 0, 'init': None, 'min_samples_leaf': 1, 'max_leaf_nodes': None, 'center': True, 'scale': True, 'random_state': None, 'multilabel': False, 'n_estimators': 700, 'label_weights': None, 'subsample': 1.0, 'max_features': 'log2', 'warm_start': False, 'min_samples_split': 2}
	Environment:
	 - revscoring_version: '2.1.0'
	 - platform: 'Linux-4.9.0-6-amd64-x86_64-with-debian-9.4'
	 - machine: 'x86_64'
	 - version: '#1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02)'
	 - system: 'Linux'
	 - processor: ''
	 - python_build: ('default', 'Jan 19 2017 14:11:04')
	 - python_compiler: 'GCC 6.3.0 20170118'
	 - python_branch: ''
	 - python_implementation: 'CPython'
	 - python_revision: ''
	 - python_version: '3.5.3'
	 - release: '4.9.0-6-amd64'
	
	Statistics:
	counts (n=665):
		label      n         ~Stub    ~Start    ~C    ~B    ~GA    ~FA
		-------  ---  ---  -------  --------  ----  ----  -----  -----
		'Stub'    24  -->       14         3     0     0      6      1
		'Start'   22  -->        4         2     3     6      7      0
		'C'       26  -->        1         1     2     1     18      3
		'B'       66  -->        1         0     1     9     41     14
		'GA'     273  -->        0         0     1    10    189     73
		'FA'     254  -->        1         1     0     6     80    166
	rates:
		              'Stub'    'Start'    'C'    'B'    'GA'    'FA'
		----------  --------  ---------  -----  -----  ------  ------
		sample         0.036      0.033  0.039  0.099   0.411   0.382
		population     0.036      0.033  0.039  0.099   0.411   0.382
	match_rate (micro=0.365, macro=0.167):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.032  0.048  0.386  0.513  0.011    0.011
	filter_rate (micro=0.635, macro=0.833):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.968  0.952  0.614  0.487  0.989    0.989
	recall (micro=0.574, macro=0.372):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.583  0.136  0.654  0.692  0.077    0.091
	!recall (micro=0.751, macro=0.888):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.989  0.962  0.779  0.612  0.992    0.992
	precision (micro=0.547, macro=0.453):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.667  0.281  0.646  0.554  0.286    0.286
	!precision (micro=0.799, macro=0.892):
		  Stub     B     FA     GA      C    Start
		------  ----  -----  -----  -----  -------
		 0.984  0.91  0.784  0.741  0.964     0.97
	f1 (micro=0.551, macro=0.388):
		  Stub      B    FA     GA      C    Start
		------  -----  ----  -----  -----  -------
		 0.622  0.184  0.65  0.616  0.121    0.138
	!f1 (micro=0.773, macro=0.889):
		  Stub      B     FA    GA      C    Start
		------  -----  -----  ----  -----  -------
		 0.987  0.935  0.781  0.67  0.978    0.981
	accuracy (micro=0.736, macro=0.858):
		  Stub     B     FA     GA      C    Start
		------  ----  -----  -----  -----  -------
		 0.974  0.88  0.731  0.645  0.956    0.962
	fpr (micro=0.249, macro=0.112):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.011  0.038  0.221  0.388  0.008    0.008
	roc_auc (micro=0.735, macro=0.806):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.987  0.663  0.749  0.694  0.811    0.934
	pr_auc (micro=0.537, macro=0.414):
		  Stub      B     FA     GA      C    Start
		------  -----  -----  -----  -----  -------
		 0.674  0.205  0.647  0.563  0.147    0.249
	
	 - score_schema: {'type': 'object', 'title': 'Scikit learn-based classifier score with probability', 'properties': {'probability': {'type': 'object', 'description': 'A mapping of probabilities onto each of the potential output labels', 'properties': {'Stub': 'number', 'B': 'number', 'FA': 'number', 'GA': 'number', 'C': 'number', 'Start': 'number'}}, 'prediction': {'type': 'string', 'description': 'The most likely label predicted by the estimator'}}}
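
As a sanity check on the statistics above, the per-class recall, precision and match_rate can be recomputed directly from the counts table. The sketch below is only an illustration: the names `LABELS`, `COUNTS` and `per_class_stats` are made up for this example and are not part of revscoring, and it assumes the table rows are true labels and the `~` columns are predicted labels.

```python
# Labels in the order used by the counts table.
LABELS = ["Stub", "Start", "C", "B", "GA", "FA"]

# Confusion matrix copied from the counts table.
# Assumption: rows are true labels, columns are predicted labels.
COUNTS = [
    [14, 3, 0, 0, 6, 1],      # Stub  (n=24)
    [4, 2, 3, 6, 7, 0],       # Start (n=22)
    [1, 1, 2, 1, 18, 3],      # C     (n=26)
    [1, 0, 1, 9, 41, 14],     # B     (n=66)
    [0, 0, 1, 10, 189, 73],   # GA    (n=273)
    [1, 1, 0, 6, 80, 166],    # FA    (n=254)
]


def per_class_stats(counts, labels):
    """Hypothetical helper: derive per-class statistics from a confusion matrix."""
    total = sum(sum(row) for row in counts)
    stats = {}
    for i, label in enumerate(labels):
        true_pos = counts[i][i]
        actual = sum(counts[i])                    # observations with this true label
        predicted = sum(row[i] for row in counts)  # observations predicted as this label
        stats[label] = {
            "n": actual,
            "recall": true_pos / actual if actual else 0.0,
            "precision": true_pos / predicted if predicted else 0.0,
            "match_rate": predicted / total,
        }
    return stats


stats = per_class_stats(COUNTS, LABELS)

# Unweighted mean of the per-class recalls.
macro_recall = sum(s["recall"] for s in stats.values()) / len(stats)

# Fraction of all observations that were predicted correctly.
total = sum(sum(row) for row in COUNTS)
fraction_correct = sum(COUNTS[i][i] for i in range(len(LABELS))) / total

for label in LABELS:
    s = stats[label]
    print(f"{label:>5}  n={s['n']:<3}  recall={s['recall']:.3f}  "
          f"precision={s['precision']:.3f}  match_rate={s['match_rate']:.3f}")
print(f"macro recall={macro_recall:.3f}  fraction correct={fraction_correct:.3f}")
```

The recomputed values agree with the reported per-class numbers, and the `n` column makes the imbalance explicit: Stub, Start and C together account for only 72 of the 665 observations.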

This is somehow missing a ton of observations for the lower classes (Stub, Start and C each have fewer than 30 of the 665). Those should come out of one of the labeling campaigns.

That actually surprised me as well...

So, one of the labeling campaigns is probably not getting pulled in. Want to check on that?

I double-checked that. Both campaigns are pulled in.

I checked on this with @Ladsgroup and it looks like we accidentally had people label the GA/FA set rather than the 600-observation sample. So I've loaded the 600-observation sample (see http://labels.wmflabs.org/ui/fawiki/). On the bright side, we'll have better data about the 300-observation GA/FA sample (not all of them were labeled GA/FA).

awight subscribed.

Looks like this is still blocked on the new labeling campaign (at 18%), so I'm moving it out of the review column.