Revscoring: Statistic for multilabel classification
Closed, Resolved (Public)


Revscoring currently supports scoring items with only a single target label. The predictions used to count true positives are generated with something like:

y_preds = [s[self.prediction_key] == label for s, l in score_labels]


To count true positives correctly in multi-label cases (where the target label might be a list of categories an article belongs to), we need to check membership in the label set rather than strict equality (==), like:
y_preds = [label in s[self.prediction_key] for ...]
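
To make the difference concrete, here is a minimal standalone sketch of the two checks. It is not revscoring's actual code: the `label_predictions` helper, the `PREDICTION_KEY` constant (standing in for `self.prediction_key`), and the sample score documents are all hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for self.prediction_key; not taken from revscoring.
PREDICTION_KEY = "prediction"


def label_predictions(score_labels, label, multilabel=False):
    """Return one boolean per (score, true_label) pair: was `label` predicted?"""
    if multilabel:
        # Multi-label: the prediction is a collection, so test membership.
        return [label in s[PREDICTION_KEY] for s, _ in score_labels]
    # Single-label: the prediction is one value, so test equality.
    return [s[PREDICTION_KEY] == label for s, _ in score_labels]


# Made-up sample data: (score document, true label or labels).
single = [({"prediction": "A"}, "A"), ({"prediction": "B"}, "A")]
multi = [
    ({"prediction": ["A", "B"]}, ["A"]),
    ({"prediction": ["C"]}, ["A", "C"]),
]

single_flags = label_predictions(single, "A")
multi_flags = label_predictions(multi, "A", multilabel=True)
```

One thing to watch with the membership form: if a prediction is a plain string rather than a list or set, `label in prediction` becomes a substring test in Python, so the two cases probably need to be told apart explicitly rather than by always using `in`.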

We should also use this opportunity to define an overall strategy for handling multiclass classification scoring in revscoring across the different fitness statistics.