ContentTranslation does not translate accented words correctly
Closed, Invalid · Public

Description

The tool is always messing up translations containing accented letters. The most recent example is this:

  • Expected result: 'polynomial' -> 'polinômio'
  • Actual result: 'polynomial' -> 'polinˆ omio'

This happened using Firefox 50, on https://pt.wikipedia.org/wiki/Special:ContentTranslation?page=Rational+root+theorem&from=en&to=pt&targettitle=Teorema+das+ra%C3%ADzes+racionais
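To make a report like this easier to triage, it can help to show exactly which code points the output contains. The sketch below is only a heuristic, not part of ContentTranslation or any Yandex API: `detached_accents` and `SUSPECT_CATEGORIES` are hypothetical names, and it assumes the stray character in the actual result is U+02C6 (MODIFIER LETTER CIRCUMFLEX ACCENT) followed by a space. It normalizes the text to NFC and lists any accent-like characters that are still not attached to a base letter.

```python
import unicodedata

# Unicode general categories that hint at an accent not fused with its letter:
# Mn = combining mark, Lm = modifier letter, Sk = modifier symbol.
# Assumption: for Portuguese text these should not survive NFC normalization.
SUSPECT_CATEGORIES = {"Mn", "Lm", "Sk"}


def detached_accents(text):
    """Return (index, code point, Unicode name) for each suspect character after NFC."""
    normalized = unicodedata.normalize("NFC", text)
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(normalized)
        if unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


expected = "polin\u00f4mio"  # 'polinômio', precomposed o with circumflex
actual = "polin\u02c6 omio"  # assumed encoding of the reported output

print(detached_accents(expected))
print(detached_accents(actual))
```

A correctly composed word yields an empty list, while a spacing circumflex that was never merged with the following 'o' is reported with its position and name.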

Event Timeline

He7d3r created this task. Dec 11 2016, 12:53 PM
Restricted Application added a subscriber: Aklapper. Dec 11 2016, 12:53 PM

Which machine translation provider does this?

Yandex.Translate

The web interface of Yandex.Translate seems to return the correct translation.

Arrbee added a subscriber: Arrbee. Jan 15 2018, 11:26 AM

Please note, this may be due to the Yandex translation API version that is used in Content Translation. We are currently using the officially released version of the API, while Yandex's web interface uses an improved technology that is being beta-tested. This could be the reason why the correct translation is shown on the web interface. Thanks.

I used the sentence from the source article mentioned in the bug report (Rational root theorem) and confirmed that the wrong result comes from the Yandex API.

So this is not a bug in the CX codebase but in the Yandex MT API. Perhaps this fragment has a wrong translation in Yandex's MT training data.

Petar.petkovic closed this task as Invalid. Mar 4 2018, 1:08 AM
Petar.petkovic moved this task from Bugs to MT on the ContentTranslation board.