A set of new languages is now available for Google Translate. As with past enablements, it may take some time until they are available in the external APIs. Once they are available, we may want to enable the Google support in Content Translation. This ticket compiles the languages to enable. Below you can find them grouped by their current support on Wikipedia:
A) Languages with a Wikipedia and MT support already. We can enable the new support from Google as a non-default to provide them another option, with no need for specific coordination:
- ✅ Acehnese (ace)
- ✅ Avar/Avaric (av)
- ✅ Awadhi (awa)
- ✅ Balinese (ban)
- ✅ Bambara (bm)
- ✅ Bashkir (ba)
- ✅ Betawi (bew)
- ✅ Breton (br)
- ✅ Chamorro (ch)
- ✅ Chechen (ce)
- ✅ Chuvash (cv)
- ✅ Dinka (din)
- ✅ Dzongkha (dz)
- ✅ Faroese (fo)
- ✅ Fijian (fj)
- ✅ Fon (fon)
- ✅ Friulian (fur)
- ✅ Iloko/Ilocano (ilo)
- ✅ Jamaican Patois/Jamaican Creole English (jam)
- ✅ Kapampangan (pam)
- ✅ Komi (kv)
- ✅ Konkani (gom)
- ✅ Latgalian (ltg)
- ✅ Ligurian (lij)
- ✅ Limburgish (li)
- Lombard (lmo)
- Manx (gv)
- Meadow/Eastern Mari (mhr)
- Meiteilon/Manipuri (mni)
- Minang/Minangkabau (min)
- Nepalbhasa/Newari (new)
- Sepedi/Northern Sotho (nso)
- Occitan (oc)
- Odia (or)
- Ossetian (os)
- Pangasinan (pag)
- Papiamento (pap)
- Rundi (rn)
- Sango (sg)
- Shan (shn)
- Sicilian (scn)
- Silesian (szl)
- Swati (ss)
- Tahitian (ty)
- Tetum (tet)
- Tibetan (bo)
- Tok Pisin (tpi)
- Tongan (to)
- Tswana (tn)
- Tulu (tcy)
- Tumbuka (tum)
- Tuvan/Tuvinian (tyv)
- Udmurt (udm)
- Venda (ve)
- Venetian (vec)
- Western Punjabi (pnb). Google Translate supports Punjabi using Shahmukhi script with the code pa-Arab.
- Wolof (wo)
- Yakut (sah)
- Waray (war)
Communication in progress
B) Languages with a Wikipedia but some open questions. We want to check with communities whether the MT support is useful (in bold those getting machine translation for the first time), or some other questions about the specific variant used:
- Abkhaz/Abkhazian (ab)
- Batak Toba (bbc)
- Cantonese (zh-yue)
- Kalaallisut (kl)
- Madurese (mad)
- NKo (nqo)
- Northern Sami (se). Do not enable for Northern Sami: a member of the community stated that the quality is poor and won't be useful for their work.
- Bikol. Google uses code bik. Wikipedia uses bcl for Central Bikol, but it is unclear whether that is the variant supported by Google.
  - Enable for Central Bikol. A contributor indicated that the MT will be useful in their Wikipedia.
- Crimean Tatar (crh). Google Translate provides translations in Cyrillic script, while Crimean Tatar Wikipedia uses both Latin and Cyrillic scripts through a converter. We may want to check whether the Google support is useful for the community.
- Fulani/Fula (ff). This language has several varieties with several language codes. We may need to check with the community whether the variant provided by Google Translate is useful.
- Kikongo (kg). We need to check with the community whether the variant provided by Google Translate is useful. In particular, we may want to check whether they find it useful to use the translations Google provides for Kongo (kg), those provided for Kituba (ktu), or neither.
- Nahuatl (nah). Google uses code nhe. We need to check with the community whether the variant provided by Google Translate is useful.
- Romani (rom). The Vlax Romani Wikipedia uses code rmy. We need to check with the community whether the variant provided by Google Translate is useful.
- Tamazight. Google uses code ber and supports both Tifinagh and Latin scripts. Wikipedia uses zgh for Standard Moroccan Tamazight (using the Tifinagh script), but it is unclear whether that is the variant supported by Google. Do not enable for Tamazight: a member of the community indicated that Google's variant (Kabyle) is not the same as the Amazigh language with code zgh, which is officially recognised in Morocco and used in the wiki.
C) Languages with no Wikipedia yet:
- Acholi (ach)
- Afar (aa) In Incubator
- Alur (alz)
- Baluchi (bal) In Incubator with three projects for codes bgp, bgn, and bcc
- Baoulé (bci) In Incubator
- Batak Karo (btx)
- Batak Simalungun (bts)
- Bemba (bem)
- Buryat (bua)
- Chuukese (chk)
- Dari (prs) Google uses code fa-AF
- Dogri (doi in Google, dgo in Wikimedia) In Incubator
- Dombe (ndq)
- Dyula (dyu)
- Ga (gaa) In Incubator
- Hakha Chin (cnh) In Incubator
- Hiligaynon (hil) In Incubator
- Hunsrik (hrx) In Incubator
- Iban (iba) In Incubator
- Jingpo (kac)
- Kanuri (kr) In Incubator with code knc
- Khasi (kha)
- Kiga (cgg)
- Kituba (ktu)
- Kokborok (trp)
- Krio (kri) In Incubator
- Luo (luo) In Incubator
- Makassar (mak)
- Mam (mam)
- Marshallese (mh) In Incubator
- Marwadi (mwr in Google, rwr in Wikimedia) In Incubator
- Mauritian Creole (mfe) In Incubator
- Mizo (lus) In Incubator
- Ndau (ndc)
- Nuer (nus) In Incubator
- Qʼeqchiʼ (kek)
- Seychellois Creole (crs)
- Southern Ndebele (nr) In Incubator
- Susu (sus)
- Tiv (tiv)
- Yucatec Maya (yua) In Incubator
- Zapotec (zap) In Incubator
D) Languages not to enable:
- Santali (sat). Google Translate uses Latin script; Santali Wikipedia uses Ol Chiki script.
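
Several entries in this ticket note that Google uses a different code from the one used on the Wikimedia side. A small lookup could capture those mismatches. The following is a minimal sketch, not the actual Content Translation configuration: `TICKET_TO_GOOGLE` and `to_google_code` are hypothetical names, and only the pairs stated explicitly in this ticket are included. Bikol, Nahuatl, and Romani are left out because their variants are still unconfirmed.

```python
# Hypothetical lookup: code listed in this ticket -> code Google Translate uses.
# Only pairs stated explicitly in the ticket are included; this is an
# illustration, not the real Content Translation configuration.
TICKET_TO_GOOGLE = {
    "pnb": "pa-Arab",  # Western Punjabi (Shahmukhi script)
    "prs": "fa-AF",    # Dari
    "dgo": "doi",      # Dogri
    "rwr": "mwr",      # Marwadi
}


def to_google_code(code: str) -> str:
    """Return the Google code for a ticket code, or the same code if no override exists."""
    return TICKET_TO_GOOGLE.get(code, code)
```

Languages without an override, such as Breton (br), pass through unchanged, so only the exceptions need to be maintained.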
Related: T308248: Newly supported languages in Google Translate