- Run Bad-Words-Detection-System to get potential badword list
- Human review of BWDS list
- Integrate into revscoring
Description
Event Timeline
@calbon Can you weigh in on this? AIUI, this would be a nontrivial update to Revscoring.
@Hakimi97 Hi! The ML team doesn't plan to support or expand revscoring-based models in the future. Together with the Research team, we suggest trying the Revert Risk models instead. There are two variants:
- Language Agnostic - https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-agnostic_revert_risk
- Multi-Lingual - https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Multilingual_revert_risk
As far as I can see, the Language Agnostic model supports the ms language, so would you be open to giving it a try?
@elukey Yes, sure, I would like to try the Language Agnostic model for the ms language (specifically mswiki) if possible. Where should I start?
@Hakimi97 in the following link you can find some examples:
We also have https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage with a lot of information, but I'd suggest starting with the first link, which is more self-contained. Let us know if you have any other questions!
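For readers following along, here is a minimal Python sketch of what a call to the Language Agnostic model might look like. It is an illustration under stated assumptions, not an official client: the base URL, the `:predict` path suffix, and the `rev_id`/`lang` payload fields follow my understanding of the Lift Wing Usage page linked above and should be checked against it. The helper names, the User-Agent string, and the revision ID in the usage note are hypothetical.

```python
import json
import urllib.request

# Assumed public endpoint pattern for Lift Wing; verify it on the Usage page.
BASE_URL = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_request(model, rev_id, lang):
    """Build (but do not send) a POST request for one revision score."""
    url = f"{BASE_URL}/{model}:predict"
    # Assumed payload shape: a revision ID plus the wiki language code.
    payload = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Hypothetical User-Agent; replace with your own tool name and contact.
        "User-Agent": "mswiki-revertrisk-test/0.1 (hypothetical contact)",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def score_revision(model, rev_id, lang, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(model, rev_id, lang)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `score_revision("revertrisk-language-agnostic", 12345, "ms")` would request a score for mswiki revision 12345 (a placeholder ID). Swapping the model name for the multilingual variant should work the same way, assuming it accepts the same payload.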
@elukey I have run both the Revert Risk Language Agnostic and the Multilingual Revert Risk models using Python, and both of them work well! Thank you for the support. I will let you know if I have any questions later.
@Hakimi97 nice! Do you want to keep this task open or should we close it? (Feel free to contact us on Libera IRC at #wikimedia-ml or with another Phabricator task anytime, of course.)