- Develop a score processor that can operate on a last-revision article XML dump
- Run the processor and obtain scores
- prediction (Stub, Start, C, B, GA, FA)
- weighted_sum (a weighted sum of probabilities that allows us inter-class measurements)
Re. weighted_sum, see https://github.com/halfak/quality-bias/blob/master/score_revisions.py#L88 This strategy seems to produce stable measurements.