Add language support for Malay language (ms)
Closed, ResolvedPublic

Assigned To

Authored By

	Hakimi97
	Oct 29 2023, 1:08 AM

Description

Run Bad-Words-Detection-System to get potential badword list
Human review of BWDS list
Integrate into revscoring

Event Timeline

Hakimi97 created this task. Oct 29 2023, 1:08 AM

Restricted Application added a subscriber: Aklapper. Oct 29 2023, 1:09 AM

Maintenance_bot added a project: artificial-intelligence. Oct 29 2023, 1:29 AM

For reference:
https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Word_lists/ms

@calbon Can you weigh in on this? AIUI, this would be a nontrivial update to Revscoring.

klausman moved this task from Unsorted to Ready To Go on the Machine-Learning-Team board. Nov 14 2023, 3:40 PM

@Hakimi97 Hi! The ML team doesn't plan to support or expand revscoring-based models in the future. We and the Research team suggest trying the Revert Risk models instead. There are two variants:

Language Agnostic - https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-agnostic_revert_risk
Multi-Lingual - https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Multilingual_revert_risk

As far as I can see, the Language Agnostic model supports the ms language, so would you be open to giving it a try?

https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reverted_risk_language_agnostic_prediction

@elukey Yes, sure. I would like to try the Language Agnostic model for the ms language (specifically mswiki) if possible. Where should I start?

@Hakimi97 in the following link you can find some examples:

https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reverted_risk_language_agnostic_prediction

We also have https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage with a lot of information, but I'd suggest starting with the first link, which is more self-contained. Let us know if you have any other questions!
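For readers following along, here is a minimal Python sketch of what a call to the Language Agnostic model for an mswiki revision might look like. It is not taken from this thread: the endpoint URL, the `rev_id`/`lang` request fields, and the `output.probabilities.true` response field are assumptions based on the public Lift Wing reference linked above and should be checked against that page. The function names and the User-Agent string are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for the Language Agnostic Revert Risk model; verify it
# against the Lift Wing API reference linked above before relying on it.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)


def build_request(rev_id: int, lang: str = "ms") -> urllib.request.Request:
    """Build the POST request for one revision (field names are assumptions)."""
    body = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        LIFTWING_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Hypothetical User-Agent; replace with your own tool and contact.
            "User-Agent": "mswiki-revertrisk-example/0.1",
        },
        method="POST",
    )


def revert_probability(payload: dict) -> float:
    """Extract the revert probability, assuming an
    {"output": {"probabilities": {"true": ...}}} response shape."""
    return float(payload["output"]["probabilities"]["true"])


def fetch_revert_probability(rev_id: int, lang: str = "ms",
                             timeout: float = 10.0) -> float:
    """Send the request to Lift Wing and return the revert probability."""
    with urllib.request.urlopen(build_request(rev_id, lang), timeout=timeout) as resp:
        return revert_probability(json.load(resp))
```

Calling `fetch_revert_probability(<revision id>, "ms")` with a real mswiki revision ID performs the network request. Request building and response parsing are kept in separate functions so they can be checked without network access.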

@elukey I have tried running both the Revert Risk Language Agnostic and the Multilingual Revert Risk models using Python, and both work well! Thank you for the support. I will let you know if I have any questions later.

@Hakimi97 nice! Do you want to keep this task open, or should we close it? (Feel free to contact us on Libera IRC at #wikimedia-ml or with another Phabricator task anytime, of course.)

elukey closed this task as Resolved. Nov 21 2023, 3:45 PM

calbon moved this task from Ready To Go to 2023-2024 Q3 Done on the Machine-Learning-Team board. Nov 29 2023, 2:16 PM
