# Background
**Product overview document:** https://docs.google.com/document/d/1rUzRNBGKi7Vi9RS4vVXaNyNUzqc99-xvkTbmsz0FkC8/edit
----
//If we enable communities to automatically prevent or revert obvious vandalism, moderators will have more time to spend on other activities.//
----
## Goals
- Reduce moderation backlogs by preventing bad edits from entering patroller queues.
- Give moderators confidence that automoderation is reliable and is not producing significant false positives.
- Ensure that editors caught in a false positive have clear avenues to flag the error and have their edit reinstated.
Further user stories are documented [[ https://docs.google.com/document/d/1rUzRNBGKi7Vi9RS4vVXaNyNUzqc99-xvkTbmsz0FkC8/edit#heading=h.mktmvkdvaa0n | here ]].
## Helpful links
- [[ https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing | LiftWing ]] ([[ https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage | Usage ]])
- Models: [[ https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Multilingual_revert_risk | Multilingual ]] / [[ https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-agnostic_revert_risk | Language-agnostic ]]
## Anti-vandalism bots
| Bot | Code repository |
| --- | --- |
| ClueBot NG | https://github.com/cluebotng |
| SeroBOT | https://github.com/dennistobar/serobot |
| ChenzwBot | https://gitlab.com/antivandalbot-ng |
| Рейму Хакурей | https://github.com/Saisengen/wikibots/blob/main/other-bots/vand-rollbacker-DB.cs |
| PatrocleBot | https://github.com/rowiki/oresreverter |
# Investigation
We want to investigate the technical approach we might take for Automoderator at a high level, answering questions such as:
- Should this be a MediaWiki extension, or some kind of Cloud-hosted tool?
- How should we approach community configuration? Will we aim to use the Growth team's Community Configuration toolset?
- Are we likely to have any technical requests for the Machine Learning platform team for our use of LiftWing?
- As a tool which will be actively editing Wikimedia projects, are there any development principles we can set to ensure that we minimise the introduction of breaking changes as we iterate on it?
Engineers should feel free to tackle any other high-level questions they might have about our approach beyond the above.
## Findings
> Should this be a MediaWiki extension, or some kind of Cloud-hosted tool?
The part that actually does the reverting (or not) should be an external tool. It will need to take each revision on a given project as input. To scale effectively, that means subscribing to a stream of filtered events and requesting a score for each one, either via changeprop or Flink. An extension would mainly be beneficial if we were using hooks to fire those requests after the creation of a revision, and it would come with the downsides of being tied to the deployment train and sharing resources with the production site. We could use a helper extension for user-facing workflows such as notifications and reporting if needed.
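To make that shape concrete, here's a minimal consumer sketch in Python. It assumes the public EventStreams recentchange feed and a LiftWing revert-risk endpoint as documented on wikitech; the exact route, payload, and response shape should be double-checked against the LiftWing usage page, and a production deployment would sit behind changeprop/Flink rather than raw SSE.
```
import json

import requests

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"
# Assumed LiftWing route; verify against the LiftWing usage docs.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)

def score(rev_id: int, lang: str) -> float:
    """Ask LiftWing for a revert-risk probability for one revision."""
    resp = requests.post(
        LIFTWING_URL, json={"rev_id": rev_id, "lang": lang}, timeout=10
    )
    resp.raise_for_status()
    # Assumed response shape: {"output": {"probabilities": {"true": ...}}}
    return resp.json()["output"]["probabilities"]["true"]

def consume(wiki: str = "enwiki", lang: str = "en") -> None:
    """Filter the firehose down to mainspace edits on one wiki and score them."""
    with requests.get(STREAM_URL, stream=True, timeout=60) as stream:
        for line in stream.iter_lines():
            # Server-sent events arrive as "data: {...}" lines.
            if not line.startswith(b"data: "):
                continue
            change = json.loads(line[len(b"data: "):])
            if (
                change.get("type") != "edit"
                or change.get("wiki") != wiki
                or change.get("namespace") != 0
            ):
                continue
            rev_id = change["revision"]["new"]
            print(rev_id, score(rev_id, lang))

if __name__ == "__main__":
    consume()
```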
> How should we approach community configuration? Will we aim to use the Growth team's Community Configuration toolset?
Since this will be off-wiki, we won't be using Community Configuration. Enabling the tool for a project will mean updating our filter in whatever event system we subscribe to, but we'll want to create an interface for configuring the thresholds within each active project. This could be as simple as a web form with a couple of numeric input fields that only allows access to a small set of users via OAuth.
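As an illustration, the whole thing could be as small as the sketch below; Flask, the route, and the field name are placeholders, and `current_user()` stands in for a real OAuth flow.
```
import flask

app = flask.Flask(__name__)

# Hypothetical in-memory store; a real tool would persist per-wiki config.
THRESHOLDS: dict[str, float] = {"enwiki": 0.95}
ALLOWED_USERS = {"ExampleAdmin"}  # populated from on-wiki consensus

def current_user() -> str:
    """Placeholder: a real deployment would resolve identity via OAuth."""
    return flask.request.headers.get("X-Debug-User", "")

@app.post("/config/<wiki>")
def set_threshold(wiki: str):
    # Only the small, community-agreed set of users may change thresholds.
    if current_user() not in ALLOWED_USERS:
        flask.abort(403)
    THRESHOLDS[wiki] = float(flask.request.form["revert_threshold"])
    return {"wiki": wiki, "revert_threshold": THRESHOLDS[wiki]}
```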
> Are we likely to have any technical requests for the Machine Learning platform team for our use of LiftWing?
I think questions will come up as we implement, but none that require special consideration; the Machine Learning team has been very responsive to one-off queries.
> As a tool which will be actively editing Wikimedia projects, are there any development principles we can set to ensure that we minimise the introduction of breaking changes as we iterate on it?
Yes. Especially at the beginning, we should design to avoid false positives, even at the cost of efficacy.
We should hard-code some guardrails that cannot be overridden by configuration. The guardrails should be internal to whatever module/class performs the action being guarded. For example:
- We should disallow revert thresholds below a designated "safe" value, such as a 90% revert risk probability. To help protect ourselves while developing the tool, the code that actually performs the revert could live in a separate class/module that internally hard-codes this limit. If the guardrail is private, it is less likely to be accidentally overridden by another class/module that calls for a revert.
- If we support multiple thresholds (e.g. an additional "marginal" threshold at which the tool takes a non-revert action, such as tagging or sending a notification), we should not allow the thresholds to overlap.
Warnings/errors should be raised whenever configuration or code bumps into a guardrail (i.e. whenever it would have performed a "bad" action without the guardrail in place).
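A minimal sketch of both guardrails in Python, assuming a hard-coded 0.90 floor and an optional "marginal" threshold; the names, values, and the marginal action are illustrative, not decided.
```
import logging

logger = logging.getLogger("automoderator")

# Hard-coded floor, deliberately not exposed through configuration.
_MINIMUM_REVERT_THRESHOLD = 0.90

class Reverter:
    """Owns the revert action, so callers cannot bypass the guardrail."""

    def __init__(self, revert_threshold: float, marginal_threshold: float | None = None):
        if revert_threshold < _MINIMUM_REVERT_THRESHOLD:
            # Surface unsafe configuration loudly instead of silently accepting it.
            logger.warning(
                "Threshold %.2f is below the %.2f floor; using the floor instead.",
                revert_threshold, _MINIMUM_REVERT_THRESHOLD,
            )
        self._threshold = max(revert_threshold, _MINIMUM_REVERT_THRESHOLD)
        # Thresholds must not overlap: the marginal band sits strictly below.
        if marginal_threshold is not None and marginal_threshold >= self._threshold:
            raise ValueError("marginal threshold must sit below the revert threshold")
        self._marginal = marginal_threshold

    def handle(self, rev_id: int, score: float) -> str:
        if score > self._threshold:
            self._revert(rev_id)
            return "reverted"
        if self._marginal is not None and score > self._marginal:
            return "flagged"  # non-revert action, e.g. tagging or a notification
        return "no action"  # the default is always to do nothing

    def _revert(self, rev_id: int) -> None:
        ...  # the only code path that calls the MediaWiki undo/rollback API
```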
It should be possible to disable the tool rapidly, and the moderator community should be able to trigger that kill switch themselves in case of a spike in false positives.
We should consider the possibility of circular reverts / revert wars: if the tool reverts a revision and a human then overrules it by reverting the tool's revert, the tool should not loop by re-reverting. That might be a case for taking another action instead, such as sending a notification.
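One simple way to break that loop is a per-page cooldown once a human has overridden the tool. A rough sketch; a real implementation would persist this state and key it by revision lineage rather than page title.
```
import time

# Pages where a human has recently overridden one of our reverts.
_COOLDOWN_SECONDS = 24 * 3600
_overridden_pages: dict[str, float] = {}

def record_human_override(page: str) -> None:
    """Call when a human reverts one of the tool's reverts."""
    _overridden_pages[page] = time.time()

def may_act_on(page: str) -> bool:
    """False while the page is cooling down; notify instead of re-reverting."""
    last = _overridden_pages.get(page)
    if last is None:
        return True
    if time.time() - last > _COOLDOWN_SECONDS:
        del _overridden_pages[page]
        return True
    return False
```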
Business logic should always default to no action. As an example (sketched in Python), do this:
```
match score:
    case s if s > 0.90:
        revert()
    case _:
        return False  # the default arm takes no action
```
instead of:
```
match score:
    case s if s < 0.90:
        return False
    case _:
        revert()  # here the default arm takes an action, which we want to avoid
```
## Additional Q&A
> what led you to feel that an external tool would be the best option?
- Running the tool outside of MediaWiki would let us set up a robust, container-based service that consumes streams of revisions and is decoupled from the production MediaWiki deployment.
- We've done something similar in Cloud Services before, but there are questions about how the service should be run:
  - can we run it on Toolforge (which runs lots of tools, but does tend to have some downtime) or on the production Kubernetes cluster (which runs other non-MediaWiki production services such as EventBus, Parsoid, etc.)? Either one would help minimize our maintenance load.
  - if we run it in a Cloud Services project, how are we going to maintain the service with just two engineers?
- There isn't really a good way to run a service within an extension, so far as I can see. The on-wiki options for doing things with reverts would be:
  - using the RevisionFromEditComplete hook to fire a non-blocking function that calls the revert risk model on the completed revision and processes it. This hangs a lot of network-bound code off of a page request and is asking for problems, IMO.
  - polling the API via a job, then calling the revert risk model and processing the result. This would be slower than streaming, and I think it is still ultimately tied to page requests because of how jobs run. It also means even more network traffic than the above option.
> How much worse of a situation in terms of timeline or effort would it be for us to build an extension?
For the core functionality of reverting edits, it's not about timelines: shoehorning something that merits a long-running service into a function that gets repeatedly triggered by page requests is not a good design, and we shouldn't do it.
//However//, I see no reason to avoid an extension where it serves a purpose; I could see using one for notifications of reverts and for reporting. It would also require less setup for i18n for any user-facing pieces we put in there.
> We'll need to evaluate the UX tradeoffs of being on- or off-wiki in a little more detail:
Let me loop through some user stories. Of course, these are all just my initial assessments:
>As a moderator, I want to configure Automoderator with thresholds and settings that my community has agreed on, so that we feel confident it is acting in the way we want.
- onwiki vs offwiki - wash
- For discoverability, I'm not sure there will be a big difference between links to a special page and links to an externally hosted tool. Instead of framing this in terms of where the code is hosted, we could talk about where the entry points need to be. Even if we have a special page on-wiki, users won't know to go to it unless it is exposed to them.
- If we determine that we really need config on-wiki, we could make the service configurable via an API POST request and set up a form on a special page that sends that request on submit. We would need to keep the two in sync.
>As a moderator, I want Automoderator to only take actions on edits which it is qualified to make judgements on, so that the number of false positive reverts is minimized.
- onwiki vs offwiki - wash
- for performance reasons, some of the configured items may need to be set up in the function that filters and consumes the stream. We'll need another round of investigation on Flink vs changeprop to see whether one has an advantage in configurability, but neither of them runs on-wiki.
>As a moderator, I want to test different Automoderator settings against recent edits, so that I can understand what will happen when I save configuration changes.
- onwiki vs offwiki - offwiki
- we have some existing work on this available, and it's off-wiki; see the sketch below
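This is roughly what a dry run could look like, using the Action API's recentchanges list and the same assumed LiftWing endpoint as above; it reports what the tool //would// do at a given threshold without taking any action.
```
import requests

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed target wiki
# Assumed LiftWing route; verify against the LiftWing usage docs.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)

def score(rev_id: int, lang: str = "en") -> float:
    resp = requests.post(
        LIFTWING_URL, json={"rev_id": rev_id, "lang": lang}, timeout=10
    )
    resp.raise_for_status()
    return resp.json()["output"]["probabilities"]["true"]

def recent_rev_ids(limit: int = 50) -> list[int]:
    """Fetch recent mainspace edit revision IDs via the Action API."""
    params = {
        "action": "query", "list": "recentchanges", "rcnamespace": 0,
        "rctype": "edit", "rclimit": limit, "rcprop": "ids", "format": "json",
    }
    data = requests.get(API_URL, params=params, timeout=10).json()
    return [rc["revid"] for rc in data["query"]["recentchanges"]]

def dry_run(threshold: float) -> None:
    """Report how many recent edits would be reverted; takes no action."""
    revs = recent_rev_ids()
    would_revert = [r for r in revs if score(r) > threshold]
    print(f"{len(would_revert)}/{len(revs)} recent edits exceed {threshold:.2f}")
```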
>As a new good faith editor, I want to know when Automoderator has reverted one of my edits and be given clear steps for reporting the false positive, so that I can have my edit reinstated.
- onwiki vs offwiki - onwiki looks best at first blush, but it might come down to what our moderator communities want. I think more user research is warranted.
- question: (how) are we going to handle temp/IP users?
- talk page approach:
- onwiki vs offwiki - wash
- editcheck approach:
- onwiki vs offwiki - onwiki
>As a moderator, I want to review false positive reports from new editors, so that I can reinstate good edits which shouldn’t have been reverted.
- onwiki vs offwiki - onwiki
>As a Wikimedia Foundation researcher, I want false positive report data to be available to me so that I can retrain the model and make it more accurate.
- onwiki vs offwiki - too early to call
- the requirements here are still so wide open that any assessment of the best approach would be premature