Enable MinT for all languages supported by IndicTrans2
Closed, Resolved, Public

Description

After the initial exploration of IndicTrans2 (T337656), we want to enable MinT (list of current language support) for all the languages that IndicTrans2 supports:

  1. Assamese (as/asm_Beng) — ✅ Enabled as part of T337656
  2. Bangla (bn/ben_Beng) — ✅ Enabled as part of T337656
  3. Hindi (hi/hin_Deva) — ✅ Enabled as part of T337656
  4. Kashmiri (ks/kas_Arab) — ✅ Enabled as part of T337656
  5. Santali (sat/sat_Olck) — ✅ Enabled as part of T337656
  6. Goan (gom/gom_Deva)
  7. Gujarati (gu/guj_Gujr)
  8. Kannada (kn/kan_Knda)
  9. Maithili (mai/mai_Deva)
  10. Malayalam (ml/mal_Mlym)
  11. Manipuri (mni/mni_Beng & mni_Mtei)
  12. Marathi (mr/mar_Deva)
  13. Nepali (ne/npi_Deva)
  14. Oriya (or/ory_Orya)
  15. Panjabi (pa/pan_Guru)
  16. Sanskrit (sa/san_Deva)
  17. Sindhi (sd/snd_Arab & snd_Deva)
  18. Tamil (ta/tam_Taml)
  19. Telugu (te/tel_Telu)
  20. Urdu (ur/urd_Arab)
  21. Bodo (brx/brx_Deva) No wiki yet. Enable to have it ready for the future (unless that could be problematic)
  22. Dogri (doi/doi_Deva) No wiki yet. Enable to have it ready for the future (unless that could be problematic)

Steps:

  • Enable MinT for selected languages
  • Communicate with affected communities (also asking which service they prefer as the default).

Event Timeline

Pginer-WMF triaged this task as Medium priority.

Change 931870 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[mediawiki/services/cxserver@master] Enable MinT for all languages supported by IndicTrans2

https://gerrit.wikimedia.org/r/931870

Change 931870 merged by jenkins-bot:

[mediawiki/services/cxserver@master] Enable MinT for all languages supported by IndicTrans2

https://gerrit.wikimedia.org/r/931870

Change 932042 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update cxserver to 2023-06-21-112200-production

https://gerrit.wikimedia.org/r/932042

Change 932042 merged by jenkins-bot:

[operations/deployment-charts@master] Update cxserver to 2023-06-21-112200-production

https://gerrit.wikimedia.org/r/932042

Mentioned in SAL (#wikimedia-operations) [2023-06-22T06:39:01Z] <kart_> Updated cxserver to 2023-06-21-112200-production (T339896, T338123)

KartikMistry updated Other Assignee, added: UOzurumba.

@UOzurumba We can notify these communities about the availability of the MinT service and ask for their default preference. We will adjust the preference if we receive any feedback on the other task. Once the notification is done, we can mark this task as complete.

@KartikMistry, an adjustment in the configuration may be needed for the following languages:

  • Dogri (doi/doi_Deva)
  • Bodo (brx/brx_Deva)
  • Goan (gom/gom_Deva)

For these languages the configuration should capture only the language pairs involving English (not all combinations between them). The reason is that IndicTrans2 only supports translations from/to English, and NLLB-200 (which supports all combinations) cannot be used as a fallback for the above three languages since that model does not support them.
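
The pair restriction described above can be sketched as follows. This is only an illustration under stated assumptions, not the cxserver configuration itself: the function name `supported_pairs` and both language sets are hypothetical, and `ANY_TO_ANY` is a small illustrative subset rather than the full list from the task description.

```python
from itertools import permutations

# Hypothetical illustration, not the actual cxserver configuration.
# Languages assumed to have an NLLB-200 fallback (illustrative subset only).
ANY_TO_ANY = {"hi", "bn", "ta"}
# Languages named in the comment above: IndicTrans2 only, so English pairs only.
ENGLISH_ONLY = {"doi", "brx", "gom"}


def supported_pairs(any_to_any, english_only, pivot="en"):
    """Return the set of (source, target) pairs that should be enabled."""
    # Every ordered combination among the fallback-capable languages and English.
    pairs = set(permutations(any_to_any | {pivot}, 2))
    # For the restricted languages, add only the two directions involving English.
    for lang in english_only:
        pairs.add((pivot, lang))
        pairs.add((lang, pivot))
    return pairs


pairs = supported_pairs(ANY_TO_ANY, ENGLISH_ONLY)
print(("en", "gom") in pairs)   # True
print(("gom", "doi") in pairs)  # False
```

Keeping the restricted languages in a separate set makes the exception explicit, so adding another IndicTrans2-only language would not silently generate pairs that no model can serve.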

Change 932172 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[mediawiki/services/cxserver@master] MinT: Update IndicTrans2 only supported pairs

https://gerrit.wikimedia.org/r/932172

Change 932172 merged by jenkins-bot:

[mediawiki/services/cxserver@master] MinT: Update IndicTrans2 only supported pairs

https://gerrit.wikimedia.org/r/932172

Thank you, @KartikMistry, I will make the announcement on Monday.

Change 932683 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update cxserver to 2023-06-26-050753-production

https://gerrit.wikimedia.org/r/932683

Change 932683 merged by jenkins-bot:

[operations/deployment-charts@master] Update cxserver to 2023-06-26-050753-production

https://gerrit.wikimedia.org/r/932683

Mentioned in SAL (#wikimedia-operations) [2023-06-26T06:28:05Z] <kart_> Updated cxserver to 2023-06-26-050753-production (T340236, T339896)

Thanks, @Pginer-WMF! This is fixed and deployed now.

Translation from English to Goan seems to be failing.
Going to https://translate.wmcloud.org/ and translating from English to "gom" results in a perpetual loading state.
Trying in Section Translation shows an "MT not available" error:

gom.m.wikipedia.org_w_index.php_title=Special_ContentTranslation&from=en&to=gom&page=Malvani%20cuisine&sx=true(iPhone SE).png (1×750 px, 170 KB)

JS Console shows:

cxserver.wikimedia.org/v2/translate/en/gom/MinT:1 Failed to load resource: the server responded with a status of 500 ()

Change 933095 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[mediawiki/services/machinetranslation@master] Add gom code to WIKI2NLLBCODES

https://gerrit.wikimedia.org/r/933095
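
The patch title suggests the 500 came from a missing "gom" entry in the table that maps wiki language codes to model codes. That is an inference from the title, not something confirmed in this thread. A minimal sketch of such a lookup, assuming a plain dictionary: `WIKI_TO_MODEL_CODE`, `UnsupportedLanguageError`, and `model_code` are hypothetical names, and the entries are copied from the task description rather than from the real `WIKI2NLLBCODES` table.

```python
# Hypothetical excerpt: wiki language code -> model language code.
# Values are taken from the list in the task description; this is not
# the real WIKI2NLLBCODES table from the machinetranslation service.
WIKI_TO_MODEL_CODE = {
    "as": "asm_Beng",
    "hi": "hin_Deva",
    "brx": "brx_Deva",
    "doi": "doi_Deva",
    "gom": "gom_Deva",
}


class UnsupportedLanguageError(ValueError):
    """Raised when no model code is configured for a wiki language code."""


def model_code(wiki_code, table=WIKI_TO_MODEL_CODE):
    """Look up the model code, failing with a descriptive error if it is missing."""
    try:
        return table[wiki_code]
    except KeyError:
        raise UnsupportedLanguageError(
            f"No model code configured for {wiki_code!r}"
        ) from None


print(model_code("gom"))  # gom_Deva
```

Raising a dedicated error lets a caller answer with a clear "unsupported language" response instead of an unhandled exception surfacing as a generic server error.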

Thanks for reporting, Pau! Submitted fix. It should be deployed tomorrow.

Change 933095 merged by jenkins-bot:

[mediawiki/services/machinetranslation@master] Add gom code to WIKI2NLLBCODES

https://gerrit.wikimedia.org/r/933095

Change 933221 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update MinT to 2023-06-27-053706-production

https://gerrit.wikimedia.org/r/933221

Change 933221 merged by jenkins-bot:

[operations/deployment-charts@master] Update MinT to 2023-06-27-053706-production

https://gerrit.wikimedia.org/r/933221

Mentioned in SAL (#wikimedia-operations) [2023-06-27T09:11:33Z] <kart_> Updated MinT to 2023-06-27-053706-production (T339896, T340236)

Fixed and deployed.

@UOzurumba We can notify communities about the enablement of MinT now.

@KartikMistry Done. Thanks!