
MinT translates en dash to ??
Closed, Resolved | Public | Bug Report

Description

When translating text that contains an en dash (–), the en dash always comes out as "??".

This happens when translating from English to Spanish, Hebrew, and Kashmiri. It doesn't happen when translating to Faroese.

You can test it with a simple text such as "The opening hours are 08:00 – 17:30.", or by translating the first section of the Shah Jahan article.

In many Wikipedia articles, the en dash is used for ranges of years, hours, etc.
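For anyone who wants to check for this symptom across several target languages, here is a minimal sketch of a check in Python. The helper name `dash_was_lost` is hypothetical and not part of MinT. It only compares a source string with a translation you have already obtained, and the sample output string is made up for illustration.

```python
EN_DASH = "\u2013"  # the en dash character used in ranges such as 08:00 – 17:30


def dash_was_lost(source: str, translated: str) -> bool:
    """Return True when the source has an en dash and the output shows '??'.

    This is a heuristic for the symptom described in this report, not a
    general translation-quality check.
    """
    return EN_DASH in source and "??" in translated


source = "The opening hours are 08:00 \u2013 17:30."
# Hypothetical output strings, used only to exercise the check.
broken = "El horario es 08:00 ?? 17:30."
fine = "El horario es 08:00 \u2013 17:30."

print(dash_was_lost(source, broken))
print(dash_was_lost(source, fine))
```

The check deliberately does not require the en dash to survive in the output, since a translation may legitimately use a different dash or a word such as "to" for a range.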

Event Timeline

Change 925715 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/services/machinetranslation@master] Add a normalizer module

https://gerrit.wikimedia.org/r/925715

Change 925715 merged by jenkins-bot:

[mediawiki/services/machinetranslation@master] Add a normalizer module

https://gerrit.wikimedia.org/r/925715
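The contents of the patch are not shown in this task. As an illustration only, a text normalizer of this kind might map characters a model handles poorly to close equivalents before translation. The sketch below assumes the en dash is replaced with an ASCII hyphen; the mapping and the function name `normalize` are assumptions, not the actual module.

```python
# Hypothetical mapping: the characters the real normalizer handles are not
# listed in this task, so this table is an assumption for illustration.
REPLACEMENTS = {
    "\u2013": "-",  # en dash -> ASCII hyphen-minus
}

# Build the translation table once so repeated calls stay cheap.
_TABLE = str.maketrans(REPLACEMENTS)


def normalize(text: str) -> str:
    """Replace each mapped character with its substitute, leaving the rest as-is."""
    return text.translate(_TABLE)


print(normalize("The opening hours are 08:00 \u2013 17:30."))
```

A lookup table keeps the rule set easy to extend if other characters show the same problem.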

Change 927160 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update MinT to 2023-06-05-111431-production

https://gerrit.wikimedia.org/r/927160

Change 927160 merged by jenkins-bot:

[operations/deployment-charts@master] Update MinT to 2023-06-06-120533-production

https://gerrit.wikimedia.org/r/927160

Test status: QA PASS

MinT is able to translate "-" correctly