Improve MT support for Central Bikol with OpusMT
Closed, Resolved (Public)

Description

Now that OpusMT is integrated in Content Translation (T234194), we are looking for languages with no previous Machine Translation (MT) support that could benefit from it. OpusMT is an open-source neural machine translation system trained on freely licensed multilingual content (including articles created with Content Translation).

Enabling OpusMT is experimental: the initial quality is expected to be low, but it improves as more translations are created. Translating Wikipedia articles with Content Translation is therefore also a way for a community to improve MT support for its language, which may make the process interesting for communities to engage in.

In the initial iteration we enabled OpusMT for Assamese, and Central Bikol seems a good candidate to consider next since it is an active community with no current MT support. As we did with Assamese, we want to involve the Central Bikol community. We only plan to enable OpusMT if there are no concerns from the community. Based on how the process goes, we'll be open to adjusting the system (e.g., making the publishing limits stricter) or disabling it as needed.

Event Timeline

Pginer-WMF triaged this task as Medium priority. Sep 8 2020, 10:20 AM

@KartikMistry would it be possible to enable English->Central Bikol on the OpusMT test instance to allow editors to check when we inform the community?

Yes. We have already configured Central Bikol in OpusMT. We need to update the cxserver config to make it available to users.

Enabling it for users in Content Translation has to wait until we complete the consultation.
However, in the test instance it needs to be visible to users so that they can try it. Currently it does not appear in the options:

Screenshot 2020-10-05 at 10.25.28.png (485×1 px, 98 KB)

Change 642925 had a related patch set uploaded (by KartikMistry; owner: KartikMistry):
[mediawiki/services/cxserver@master] Config: Add English->Central Bikol MT support

https://gerrit.wikimedia.org/r/642925

Change 642925 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Config: Add English->Central Bikol MT support

https://gerrit.wikimedia.org/r/642925
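
For readers unfamiliar with this kind of change, the effect of adding a language pair can be sketched roughly as below. This is only an illustration under stated assumptions: the dict layout (source language code mapped to a list of target codes) and the helper names `enable_pair` and `is_pair_enabled` are hypothetical and are not taken from the actual cxserver config or the patch above. `bcl` is the language code for Central Bikol.

```python
# Hypothetical sketch: a language-pair table modeled as a plain dict.
# The layout (source code -> list of target codes) is an assumption made
# for illustration; it is not the actual cxserver configuration format.


def enable_pair(pairs, source, target):
    """Return a copy of `pairs` with the source->target pair enabled."""
    updated = {src: list(targets) for src, targets in pairs.items()}
    targets = updated.setdefault(source, [])
    if target not in targets:
        targets.append(target)
    return updated


def is_pair_enabled(pairs, source, target):
    """Check whether translation from `source` to `target` is listed."""
    return target in pairs.get(source, [])


# Illustrative starting state only ("as" is the code for Assamese).
opusmt_pairs = {"en": ["as"]}

# Add English -> Central Bikol.
opusmt_pairs = enable_pair(opusmt_pairs, "en", "bcl")
```

A pair is directional in this sketch, so enabling English->Central Bikol does not enable the reverse direction.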

Change 643139 had a related patch set uploaded (by KartikMistry; owner: KartikMistry):
[operations/deployment-charts@master] Update cxserver to 2020-11-23-050106-production

https://gerrit.wikimedia.org/r/643139

Change 643139 merged by jenkins-bot:
[operations/deployment-charts@master] Update cxserver to 2020-11-23-050106-production

https://gerrit.wikimedia.org/r/643139

Mentioned in SAL (#wikimedia-operations) [2020-11-30T04:26:22Z] <kart_> Updated cxserver to 2020-11-23-050106-production (T262253, T268410)

This is deployed in production; however, we are getting some 500 errors for large paragraphs. I'll create a subtask for that.