
Generate template parameter alignments for languages of interest to Section Translation
Closed, Resolved, Public

Description

To improve template support, a machine learning approach (T221211) was applied to identify the mappings/alignments of parameters for the most used templates and language pairs (generated alignments). We want to generate additional mappings for Urdu, Kurdish, Igbo, Haitian Creole, Czech, Slovak, Turkish, Xhosa, Zulu, Shona, Luganda, Hausa, Occitan, Yoruba, Bashkir, Azerbaijani, and Malay.

In particular, we want to support these additional pairs:

Run 1:

  • en → az
  • en → ba
  • en → cs (Current mapping: 243)
  • en → ha
  • en → ht
  • en → ig
  • en → ku
  • en → lg
  • en → ms
  • en → oc
  • en → sk
  • en → sn
  • en → tr (Current mapping: 387)
  • en → ur (Current mapping: 82)
  • en → xh
  • en → yo
  • en → zu

Run 2:

  • ca → oc
  • cs → sk
  • fr → ht
  • fr → oc
  • ru → ba
  • sk → cs

Run 3:

  • simple → xh

These are languages used on wikis that may be interested in Section Translation. The set includes the target wikis where the Growth team is deploying their recent tools (which may integrate with Section Translation in the future).

Event Timeline

Pginer-WMF renamed this task from Generate template parameter alignments for Urdu, Kurdish, Igbo, Haitian Creole, Kikuyu, Czech, Slovak, and Turkish to Generate template parameter alignments for Urdu, Kurdish, Igbo, Haitian Creole, Kikuyu, Czech, Slovak, Turkish, Xhosa, Zulu, Shona, Luganda, Hausa, and Occitan. Sep 13 2021, 1:57 PM
Pginer-WMF updated the task description.
Pginer-WMF renamed this task from Generate template parameter alignments for Urdu, Kurdish, Igbo, Haitian Creole, Kikuyu, Czech, Slovak, Turkish, Xhosa, Zulu, Shona, Luganda, Hausa, and Occitan to Generate template parameter alignments for languages of interest to Section Translation. Mar 29 2022, 10:46 AM
Pginer-WMF updated the task description.
KartikMistry updated the task description.
KartikMistry updated the task description.

Result of first set:

sqlite> select count(*) from templates where source_lang='en' and target_lang='az';
188
sqlite> select count(*) from templates where source_lang='en' and target_lang='ba';
64
sqlite> select count(*) from templates where source_lang='en' and target_lang='ha';
18
sqlite> select count(*) from templates where source_lang='en' and target_lang='ht';
3
sqlite> select count(*) from templates where source_lang='en' and target_lang='ig';
7
sqlite> select count(*) from templates where source_lang='en' and target_lang='ku';
38
sqlite> select count(*) from templates where source_lang='en' and target_lang='lg';
4
sqlite> select count(*) from templates where source_lang='en' and target_lang='ms';
307
sqlite> select count(*) from templates where source_lang='en' and target_lang='oc';
8
sqlite> select count(*) from templates where source_lang='en' and target_lang='sk';
58
sqlite> select count(*) from templates where source_lang='en' and target_lang='sn';
5
sqlite> select count(*) from templates where source_lang='en' and target_lang='xh';
8
sqlite> select count(*) from templates where source_lang='en' and target_lang='yo';
50
sqlite> select count(*) from templates where source_lang='en' and target_lang='zu';
14
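
The per-pair counts above could also be collected with a single grouped query instead of one `select count(*)` per pair. The following is a minimal sketch, not the script used for this task: it assumes only the `templates` table and its `source_lang` and `target_lang` columns seen in the session above. The `template` column, the in-memory database, and the sample rows are hypothetical and exist only to make the example runnable.

```python
import sqlite3


def count_alignments(conn):
    """Return {(source_lang, target_lang): count} for every pair in `templates`."""
    rows = conn.execute(
        "SELECT source_lang, target_lang, COUNT(*) "
        "FROM templates "
        "GROUP BY source_lang, target_lang "
        "ORDER BY source_lang, target_lang"
    ).fetchall()
    return {(src, tgt): n for src, tgt, n in rows}


# Hypothetical stand-in for the real alignment database: the `template`
# column and these rows are made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE templates (source_lang TEXT, target_lang TEXT, template TEXT)"
)
conn.executemany(
    "INSERT INTO templates VALUES (?, ?, ?)",
    [
        ("en", "ht", "Infobox A"),
        ("en", "ht", "Infobox B"),
        ("en", "ig", "Infobox A"),
        ("fr", "ht", "Infobox A"),
    ],
)

counts = count_alignments(conn)
for (src, tgt), n in counts.items():
    print(f"{src} > {tgt}: {n}")
```

Against the real database, the same function would be called on a connection to the actual alignment file, whose path is not given in this task.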

Change 793768 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[mediawiki/services/cxserver@master] Added template parameter alignments for 14 pairs

https://gerrit.wikimedia.org/r/793768

Change 793768 merged by jenkins-bot:

[mediawiki/services/cxserver@master] Added template parameter alignments for 14 pairs

https://gerrit.wikimedia.org/r/793768

Along with the planned template parameter alignment generation, results also improved for some existing pairs as a side effect:

  • en > ba: Old: 64, New: 65
  • en > cs: Old: 243, New: 244
  • en > fr: Old: 190, New: 220
  • en > ru: Old: 371, New: 431
  • fr > en: Old: 133, New: 150
  • ru > ca: Old: 137, New: 145
  • ru > en: Old: 298, New: 306
  • ru > fr: Old: 59, New: 71

New pairs:

  • ca > oc: 8
  • cs > sk: 60
  • fr > ht: 3
  • ru > ba: 149
  • sk > cs: 38

Change 794598 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[mediawiki/services/cxserver@master] Added template parameter alignments for simple -> xh pair

https://gerrit.wikimedia.org/r/794598

Change 794598 merged by jenkins-bot:

[mediawiki/services/cxserver@master] Added template parameter alignments for simple -> xh pair

https://gerrit.wikimedia.org/r/794598

Change 794890 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update cxserver to 2022-05-22-062659-production

https://gerrit.wikimedia.org/r/794890

Change 794890 merged by jenkins-bot:

[operations/deployment-charts@master] Update cxserver to 2022-05-22-062659-production

https://gerrit.wikimedia.org/r/794890

Mentioned in SAL (#wikimedia-operations) [2022-05-23T05:33:01Z] <kart_> Updated cxserver to 2022-05-22-062659-production (T290847)