Extracted labels might not be accurate when there are multiple reverts
Closed, Resolved (Public)

Description

If the quality of a page evolves as 2→3→4→2→4, the extractor ignores the 3. This is not ideal if the change 4→2 was made by a vandal and the change 2→4 was made by a patroller who rolled it back.

This happens because, if many revisions are reverted and a later edit then reverts to one of those revisions (e.g. the latest one), extractor.py sets lab['reverted'] = False only for the labels from that reverted_to revision. The labels from the other revisions keep lab['reverted'] = True from when they were initially reverted.
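
To make the 2→3→4→2→4 case concrete, here is a minimal sketch that reproduces the behaviour described above and contrasts it with one possible alternative. It is not the code in extractor.py: the `EDITS` list, `naive_flags` and `replayed_flags` are hypothetical names, revisions are modelled as `(rev_id, reverted_to)` pairs with increasing ids, and the "replay" approach is only an assumption about how the flags could be recomputed.

```python
# Hypothetical history for the 2→3→4→2→4 example.
# Each entry is (rev_id, reverted_to); reverted_to is None for a normal edit.
EDITS = [
    (1, None),  # quality 2
    (2, None),  # quality 3
    (3, None),  # quality 4
    (4, 1),     # vandal reverts to rev 1 (quality 2)
    (5, 3),     # patroller reverts to rev 3 (quality 4)
]


def naive_flags(edits):
    """Mimic the reported behaviour: a revert flags every revision between
    its target and itself, and un-flags only the target."""
    reverted = {}
    for rev_id, reverted_to in edits:
        reverted[rev_id] = False
        if reverted_to is not None:
            for other in reverted:
                if reverted_to < other < rev_id:
                    reverted[other] = True
            reverted[reverted_to] = False
    return reverted


def replayed_flags(edits):
    """Assumed alternative: remember which revisions were part of the page
    history at each revision, and restore that whole set on a revert."""
    surviving_at = {}
    current = frozenset()
    for rev_id, reverted_to in edits:
        # A revert restores the history as it stood at the target revision.
        base = surviving_at[reverted_to] if reverted_to is not None else current
        current = base | {rev_id}
        surviving_at[rev_id] = current
    # A revision counts as reverted if it is absent from the final history.
    return {rev_id: rev_id not in current for rev_id, _ in edits}


print(naive_flags(EDITS))
print(replayed_flags(EDITS))
```

In this sketch the naive version leaves revision 2 (the quality 3 label) flagged as reverted, while the replayed version flags only revision 4, the vandal's edit.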