Add a check to determine if a pending edit has already been reviewed by detecting if it's a revert to a previously reviewed version.
Logic:
- Check if there is a newer version in the edit history with a change tag indicating the edit is a revert or has been reverted (tags: mw-manual-revert, mw-reverted, mw-rollback, mw-undo)
- If such versions exist, check whether the version history contains an already-reviewed version with identical content
- Match versions using SHA1 content hashes to identify identical content
- If an identical reviewed version is found, the edit can be treated accordingly (e.g., auto-approved)
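The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual implementation: the Revision class, its field names, and find_reviewed_revert are hypothetical, and it assumes the revision history (IDs, SHA1 hashes, change tags, review status) has already been loaded into memory.

```python
from dataclasses import dataclass

# Change tags listed in the task description that mark a revert or a reverted edit.
REVERT_TAGS = frozenset({"mw-manual-revert", "mw-reverted", "mw-rollback", "mw-undo"})


@dataclass(frozen=True)
class Revision:
    """Hypothetical in-memory view of one revision (not a MediaWiki class)."""
    rev_id: int
    sha1: str
    tags: frozenset = frozenset()
    reviewed: bool = False


def find_reviewed_revert(history, stable_rev_id):
    """Return (tagged_rev, reviewed_rev) for the newest revert-tagged revision
    after the stable one whose content matches a reviewed revision, else None."""
    # Index reviewed revisions by content hash, keeping the newest per hash.
    reviewed_by_sha1 = {}
    for rev in history:
        if rev.reviewed:
            known = reviewed_by_sha1.get(rev.sha1)
            if known is None or rev.rev_id > known.rev_id:
                reviewed_by_sha1[rev.sha1] = rev

    # Walk the revisions newer than the stable one, newest first.
    pending = sorted(
        (rev for rev in history if rev.rev_id > stable_rev_id),
        key=lambda rev: rev.rev_id,
        reverse=True,
    )
    for rev in pending:
        if rev.tags & REVERT_TAGS:
            match = reviewed_by_sha1.get(rev.sha1)
            # A revision does not count as a match for itself.
            if match is not None and match.rev_id != rev.rev_id:
                return rev, match
    return None
```

For example, if revision 100 is reviewed, 101 changes the content, and 102 is tagged mw-undo with the same SHA1 as 100, the function returns the pair (102, 100). What the caller does with that result (e.g., auto-approve) is left open here, as in the description above.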
Performance Considerations:
The provided SQL query fetches the maximum reviewed revision IDs for all pending articles at once when data is refreshed. This query performs well on small wikis but is too slow on large wikis (ruwiki, plwiki, dewiki). Because of this, add a configuration setting to enable or disable this feature based on wiki size until a better alternative is available.
SQL query:
The following SQL query can be used to fetch the latest revision IDs for all pending articles.
SELECT
  fr_page_id,
  fr_rev_id,
  MAX(r1.rev_id) AS max_reviewed_rev_id,
  c1.content_sha1,
  page_title
FROM
  page,
  flaggedpages,
  flaggedrevs,
  slots AS s1,
  content AS c1,
  content AS c2,
  slots AS s2,
  revision AS r1,
  change_tag,
  change_tag_def
WHERE
  fp_pending_since IS NOT NULL
  AND fp_page_id = fr_page_id
  AND fr_rev_id = s1.slot_revision_id
  AND c1.content_id = s1.slot_content_id
  AND c1.content_size = c2.content_size
  AND c1.content_sha1 = c2.content_sha1
  AND c1.content_id != c2.content_id
  AND c2.content_id = s2.slot_content_id
  AND s2.slot_revision_id = r1.rev_id
  AND r1.rev_page = fp_page_id
  AND r1.rev_id > fp_stable
  AND r1.rev_id = ct_rev_id
  AND ct_tag_id = ctd_id
  AND ctd_name IN ('mw-manual-revert', 'mw-reverted', 'mw-rollback', 'mw-undo')
  AND page_namespace = 0
  AND fp_page_id = page_id
GROUP BY fp_page_id;
An example of the SQL query can be found here
Configuration:
Add a configuration parameter to enable/disable this check (recommended: disabled by default for large wikis).
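One possible shape for such a parameter is a per-wiki flag that defaults to disabled. The sketch below is only an assumption about how it could look: the setting name REVERT_DETECTION_ENABLED, the function name, and the wiki code "examplewiki" are hypothetical, not existing configuration.

```python
# Hypothetical per-wiki feature flags; wikis not listed here fall back to disabled.
REVERT_DETECTION_ENABLED = {
    "examplewiki": True,   # illustrative small wiki where the query is fast enough
    "dewiki": False,       # large wiki named in the task as too slow for the query
}


def revert_detection_enabled(wiki_code, settings=None):
    """Return True only if the check is explicitly enabled for this wiki."""
    if settings is None:
        settings = REVERT_DETECTION_ENABLED
    return settings.get(wiki_code, False)
```

Defaulting to False means a newly added wiki never runs the expensive query until someone opts it in.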
Tests:
- Test SHA1 matching logic
- Test that the Superset SQL query works
- Test that finding the latest reviewed version works correctly
- Test configuration parameter handling