[The Opus project](https://opus.nlpl.eu/) provides [translation models for many languages](https://opus.nlpl.eu/Opus-MT/). This task identifies languages that other translation services do not support and that are not among those potentially supported by NLLB-200 (T326578). It also includes languages for which feedback suggests that Opus could significantly improve translation quality over the other available options.
This ticket proposes supporting the following languages (and specific pairs):
| Language | Pair | BLEU | Status | Notes
| -------- | ---- | ---- | ------ | -----
| Central Bikol (bcl) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-bcl | en – bcl ]] | `31.9` | ✅ | Enabled as part of {T331836}.
| | tl - bcl | | | !!Model not available!!
| Cantonese (zh-yue) | zh – zh-yue | | | NLLB-200 support is not valid for the language based on {T333835}. We may want to check whether Opus support is useful.
| | en – zh-yue | | |
| Moroccan Arabic (ary) | en – ary | | | NLLB-200 support is not valid for the language based on {T339926}. We may want to check whether Opus support is useful.
| | ar – ary | | |
| Gun (guw) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-guw | en – guw ]] | `45.7` | Deployment pending |
| Cherokee (chr) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-chr | en – chr ]] | `44.6` | ✅ |
| Sranan Tongo (srn) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-srn | en – srn ]] | `34.6` | |
| Venda (ve) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-ve | en – ve ]] | `40.5` | |
| Tahitian (ty) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-ty | en – ty ]] | `46.8` | Deployment pending |
| | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/fr-ty | fr – ty ]] | `39.6` | |
| Bislama (bi) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-bi | en – bi ]] | `37.1` | |
| | th – bi | | | !!Model not available!!
| Tongan (to) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-to | en – to ]] | `59.1` | ✅ |
| Manx (gv) | pt – gv | | | !!Model not available!!
| | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-gv | en – gv ]] | `70.1` | Next Candidate |
| Walloon (wa) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-wa | en - wa ]] | `33.4` | |
| | fr - wa | | | !!Model not available!!
| Western Frisian (fy) | nl - fy | | | !!Model not available!!
| | en - fy | | | !!Model not available!!
| Breton (br) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-br | en - br ]] | `88.5` | | The sentence segmenter uses an old file format, so this model cannot be used for now.
| | fr - br | | | !!Model not available!!
| Finnish (fi) | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-fi | en - fi ]] | `25.7` | | Supported by other services already, but [feedback suggests](https://fi.wikipedia.org/wiki/Wikipedia:Kahvihuone_(uutiset)#WMF_Language_team's_reply) Opus may improve quality. Low BLEU score.
| | [[ https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/sv-fi | sv - fi ]] | `45.2` | |