
Create a maintenance script for filling mw-manual-revert and mw-reverted tags
Open, Needs Triage, Public

Description

When T256001: Detect manual reverts completed, MW started marking manual reverts with an appropriate change tag. Once T254074: Implement the reverted edit tag completes, the same will be true for reverted edits.

This is fine, but existing installations won't have these tags populated on old edits. I propose writing a maintenance script that would go through all edits on a wiki and determine which are manual reverts and which were reverted.

I propose combining both into a single script, as the use cases are similar and running everything at once will allow for some optimizations. It should be possible to turn off parts of the script with an option.

Manual reverts

We could reuse the code that's present in EditResultBuilder and construct the object for every revision on a page. That would have horrible performance, though. I think it would be faster to obtain revisions in batches from the DB and then operate on an in-memory array of them. We would really only need the rev_id and rev_sha1 fields, so we probably don't have to grab entire rows.
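A rough sketch of the in-memory approach, written in Python only for illustration (the real script would be PHP). It assumes the revisions of one page arrive oldest first as (rev_id, rev_sha1) pairs. The function name and the `radius` parameter are hypothetical; the default of 15 is an assumption about a sensible search window, not something taken from EditResultBuilder.

```python
from collections import deque
from typing import Iterable, List, NamedTuple


class Rev(NamedTuple):
    rev_id: int
    rev_sha1: str


class Revert(NamedTuple):
    revert_id: int    # revision that performed the revert
    restored_id: int  # earlier revision whose content was restored


def find_manual_reverts(revs: Iterable[Rev], radius: int = 15) -> List[Revert]:
    """Detect revisions whose content hash matches a recent earlier revision.

    `revs` must hold the revisions of a single page, oldest first.
    `radius` (hypothetical parameter) limits how far back we look.
    """
    window = deque(maxlen=radius)  # only the last `radius` revisions are kept
    found = []
    for rev in revs:
        recent = list(window)
        # A revision identical to its parent is a null edit, not a revert.
        if recent and recent[-1].rev_sha1 != rev.rev_sha1:
            # Walk from the grandparent back to the oldest revision in the window.
            for older in reversed(recent[:-1]):
                if older.rev_sha1 == rev.rev_sha1:
                    found.append(Revert(rev.rev_id, older.rev_id))
                    break
        window.append(rev)
    return found
```

Because only two small fields per revision are held in a bounded window, memory use stays constant per page regardless of history length.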

Reverted edits

To mark reverted edits, we would have to detect the revert first, so it would only be possible to detect manual reverts, rollbacks and exact undos. Then we could mark the reverted edit appropriately.
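As an illustration of the second step, here is a minimal Python sketch that, given already detected reverts, collects the revisions that should receive the reverted tag. It assumes each revert is known as a (revert_id, restored_id) pair and that every revision strictly between the two counts as reverted. The function name is hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def reverted_revision_ids(
    rev_ids: Sequence[int],
    reverts: Iterable[Tuple[int, int]],
) -> Set[int]:
    """Return the IDs of revisions undone by the given reverts.

    `rev_ids` lists one page's revisions, oldest first.
    `reverts` holds (revert_id, restored_id) pairs for that page.
    """
    position = {rev_id: index for index, rev_id in enumerate(rev_ids)}
    reverted = set()
    for revert_id, restored_id in reverts:
        # Skip pairs that refer to revisions outside this page's history.
        if revert_id not in position or restored_id not in position:
            continue
        start = position[restored_id] + 1
        end = position[revert_id]
        reverted.update(rev_ids[start:end])
    return reverted
```

The same routine would serve manual reverts, rollbacks and exact undos alike, as long as each one can be reduced to the pair of reverting and restored revisions.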

Usage

I'm not sure if Wikipedia would be interested in using such a script, as it would probably be quite slow for them. I'm also not sure if it should be included as part of the 1.35 to 1.36 update by default. I guess keeping it as an option for wikis and announcing it in the release notes would be fine as well.

Event Timeline

Change 616579 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] WIP: maintenance script for filling manual-revert and reverted tags

https://gerrit.wikimedia.org/r/616579

I recently came back to this and I'm no longer sure if I can implement it that easily. The problem is T259103: Run reverted tag update job only after the edit is approved – there is now a complex mechanism for approving the reverted tag which depends on the patrolling subsystem, user's rights and possibly extensions. To properly emulate all that behavior in a script would be... a very significant challenge, if not impossible, given that there's no detailed record of patrol actions.

Some alternatives off the top of my head:

  1. Allow the script operator to specify a certain group of people that should have their reverts apply the reverted tag. That can be a certain user group (like sysops or rollbackers) or maybe some whitelist including ex-admins. That would work only for small wikis, though. I can imagine that it would be effective in cases where you have had literally 10 admins in the history of the wiki.
  2. Forget about the simple implementation of manual reverts and construct revert graphs instead. We could use them to figure out which edits were never reverted and not mark those as reverted. That will, however, break a lot, in dozens of really fun ways.
  3. Give up on the script altogether.
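To make the first alternative more concrete, here is a small Python sketch of the filtering step. It assumes the operator passes a set of trusted groups and an optional list of individual users (for example ex-admins), and that group membership is available as a simple mapping. All names are hypothetical, and current group membership stands in for historical rights, which is exactly the approximation that alternative accepts.

```python
from typing import Dict, Iterable, List, Set, Tuple


def approved_reverts(
    reverts: Iterable[Tuple[int, str]],
    user_groups: Dict[str, Set[str]],
    allowed_groups: Set[str],
    allowed_users: Set[str] = frozenset(),
) -> List[int]:
    """Keep only reverts made by trusted users.

    `reverts` holds (revert_id, performer_name) pairs.
    A revert is kept if its performer is listed explicitly or
    belongs to at least one of the allowed groups.
    """
    approved = []
    for revert_id, performer in reverts:
        groups = set(user_groups.get(performer, ()))
        if performer in allowed_users or groups & set(allowed_groups):
            approved.append(revert_id)
    return approved
```

Only the reverts returned here would then cause the reverted tag to be applied; reverts by everyone else would be ignored.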