MinT injects a letter not present in destination alphabet (en->fa)
Open, Low · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:
For "Wikimedia" (fa: ویکی‌مدیا), MinT suggests ویکیمیڈیا. The suggested text has the following issue:

  • The letter ڈ is not part of the Persian/Farsi alphabet.
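A check like this could be automated. The following is only a minimal sketch: `PERSIAN_LETTERS` and `foreign_letters` are hypothetical names, not part of MinT, the letter set is an assumption (the 32-letter Persian alphabet plus common hamza/madda forms), and the code points are a transcription of the suggestion quoted in this report.

```python
import unicodedata

# Assumption: 32 Persian letters plus common hamza/madda forms.
PERSIAN_LETTERS = set(
    "\u0627\u0628\u067e\u062a\u062b\u062c\u0686\u062d\u062e\u062f\u0630"
    "\u0631\u0632\u0698\u0633\u0634\u0635\u0636\u0637\u0638\u0639\u063a"
    "\u0641\u0642\u06a9\u06af\u0644\u0645\u0646\u0648\u0647\u06cc"
    "\u0622\u0621\u0626\u0624\u0623"
)


def foreign_letters(text):
    """Return (index, char, Unicode name) for letters outside PERSIAN_LETTERS."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        # Only letters are checked; ZWNJ (category Cf) and punctuation are skipped.
        if unicodedata.category(ch).startswith("L") and ch not in PERSIAN_LETTERS
    ]


# Transcription of the MinT suggestion from this report.
produced = "\u0648\u06cc\u06a9\u06cc\u0645\u06cc\u0688\u06cc\u0627"
# Transcription of the expected spelling (with ZWNJ, U+200C).
expected = "\u0648\u06cc\u06a9\u06cc\u200c\u0645\u062f\u06cc\u0627"

print(foreign_letters(produced))  # flags U+0688 ARABIC LETTER DDAL at 0-based index 6
print(foreign_letters(expected))  # empty list
```

The 0-based index 6 corresponds to the seventh letter of the suggestion.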

What should have happened instead?:

  • The correct translation would be ویکی‌مدیا. Compared with the suggestion, it has a ZWNJ after the fourth letter, drops the sixth letter (ی), and has د instead of ڈ as the seventh letter (see screenshot).

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

image.png (634×1 px, 50 KB)

Event Timeline

Nikerabbit moved this task from Backlog to General translation functionality on the MinT board.
Nikerabbit subscribed.

MinT uses nllb200-600M for fa (more specifically pes_Arab; see https://gerrit.wikimedia.org/g/mediawiki/services/machinetranslation/+/906100d449d5c4662c8efe1ee5c386e862be8bea/translator/models/languages.py#55).

If this is a problem with the model, there isn't much we can do about it. Based on your report, it doesn't seem that this could be fixed with simple post-processing.
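To illustrate why a simple character substitution would likely fall short, here is a rough sketch. `CHAR_MAP` is a hypothetical fix-up table, not anything that exists in MinT, and the code points are a transcription of the two strings in the report. Mapping ڈ to د removes the foreign letter, but the result still lacks the ZWNJ and keeps the extra ی.

```python
# Transcriptions of the strings from the report (assumed code points).
produced = "\u0648\u06cc\u06a9\u06cc\u0645\u06cc\u0688\u06cc\u0627"
expected = "\u0648\u06cc\u06a9\u06cc\u200c\u0645\u062f\u06cc\u0627"

# Hypothetical one-to-one fix-up: ARABIC LETTER DDAL -> ARABIC LETTER DAL.
CHAR_MAP = str.maketrans({"\u0688": "\u062f"})

patched = produced.translate(CHAR_MAP)

print("\u0688" in patched)   # False: the foreign letter is gone
print(patched == expected)   # False: ZWNJ is still missing and the extra letter remains
```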