Disable Apertium translations for Spanish
Closed, Duplicate, Public

Event Timeline

Ignoring it, as was suggested elsewhere, is not an option. The system provides very bad translations, unusable in 99% of cases. If someone starts to "translate" using this feature, we'll spend more time fixing the translations than doing our work. Thank you.

Unedited machine translation is not allowed in translatewiki.net.

Vandalism is also not allowed on wiki projects, yet it happens.

Please consider disabling the service. It is not helpful.

I think Apertium should be disabled completely. I can confirm this issue for Bosnian/Croatian/Serbian.

Quality of translations varies by language pair. Saying "quality of translation to X" is not useful without also knowing "from Y". English is rarely supported as a source language in Apertium, so instead of doing [source text (English)]->[machine translation in target language], it often does [source text (English)]->[human translation to a supported source language in Apertium]->[machine translation in target language]. The integration in Translate doesn't show the source language, which I consider an issue.
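To illustrate the pivot described above, here is a minimal sketch in Python. It is not the Translate extension's actual code: the pair list, the function name `pick_source`, and the sample strings are all hypothetical, chosen only to show how a suggestion can end up based on a human translation rather than on the English source.

```python
# Hypothetical illustration only: SUPPORTED_PAIRS and pick_source are
# assumptions made for this example, not real Translate or Apertium data.
SUPPORTED_PAIRS = {("en", "es"), ("es", "ca"), ("es", "pt")}


def pick_source(target, available, source_lang="en"):
    """Return (language, text) to feed to MT for `target`, or None.

    `available` maps language codes to existing texts of the message,
    including the original source text.
    """
    # Prefer the original source text when a direct pair exists.
    if (source_lang, target) in SUPPORTED_PAIRS and source_lang in available:
        return source_lang, available[source_lang]
    # Otherwise pivot through an existing human translation whose
    # language is supported as a source for this target.
    for lang in sorted(available):
        if (lang, target) in SUPPORTED_PAIRS:
            return lang, available[lang]
    # No usable pair: no suggestion can be produced.
    return None


message = {"en": "Save", "es": "Guardar"}
print(pick_source("ca", message))  # pivots through the Spanish translation
print(pick_source("es", message))  # uses the English source directly
```

Returning the language code together with the text is the point of the sketch: with it, the interface could tell the translator which source a suggestion was actually produced from.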

The opinions about the usefulness of the suggestions vary a lot. We have seen this with Content Translation as well, where different people might praise and complain about the translation quality at the same time. Arguably, for shorter texts MT is less useful because the translations depend much more on unwritten context than when translating prose.

In Translate we do not track whether (and which) suggestions are used. People who are satisfied with it (or who just ignore it) don't usually report, so we have a lopsided perspective: a few complaints against an unknown number of satisfied people. I don't consider that sufficient for disabling the whole service for everyone.

You bring up the possibility of "vandalism" using machine translation. It is a valid point, but I don't consider it sufficient reason for disabling the whole service for everyone. These things can be mitigated, as we have done in Content Translation. Our current mitigations are peer review and the ability to mass-fuzzy suspected bad translations. So far this has been sufficient, though far from what I would consider a pleasant way of handling these.

Thank you for your comments. I may agree that "for shorter texts MT is less useful because the translations depend much more on unwritten context than when translating prose", as you say. However, that is precisely why this MT service is not suitable for our work as translators on betawiki. We translate short (sometimes not so short) strings of text, with context sometimes provided by the qqq documentation and sometimes not. I have been browsing the Support page on betawiki and have found similar complaints with regard to MT services, and all of them concur that the quality is very bad and that it doesn't help the translator. There are few cases in which the translation was accurate (mostly very common words with a single meaning, such as "done" or "ready"). At this point, I think having an online dictionary for words we don't know might be far better.

As a former CX user myself, I must also say that en-es translations of texts via Apertium were not good either, even when the text had much more context. As a translation administrator on Meta-Wiki, I often patrol Spanish translations, and I have always had to delete those where it is evident that the "translator" used the Apertium service, because they make no sense (and pages at Meta are longer than translatewiki strings and have some context).

I think nobody here asks for a perfect MT system. That's probably impossible (or if it exists, it'd certainly be very expensive). However, if the MT system provided to help volunteers is actually not helping us, then I think we should investigate alternatives. Maybe Apertium could be enhanced/polished?

I cannot speak for other languages, but as far as my work is concerned, Apertium is not being helpful in EN-ES translations. Maybe we could launch a survey at translatewiki in Support or elsewhere where all translators could comment.

I must also say that if this MT service isn't working well with two of the most spoken languages in the world then that's also concerning on its own IMHO.

Thank you.

Maybe Apertium could be enhanced/polished?

It is an open source project and they welcome help. In my experience, however, contributing to it has quite a steep learning curve.

Maybe we could launch a survey at translatewiki in Support or elsewhere where all translators could comment.

Better understanding of the usefulness of the MT service(s) would be useful, but I don't have time to do the research.

I must also say that if this MT service isn't working well with two of the most spoken languages in the world then that's also concerning on its own IMHO.

It helps if you understand the project's history to see where they are coming from, e.g. https://en.wikipedia.org/wiki/Apertium#History

It was originally designed to translate between closely related languages, although it has recently been expanded to treat more divergent language pairs.

Well, I think that is the problem here. English is the source for MediaWiki messages, and Spanish is not closely related to it of course.

Thanks again for your consideration.

It's the same in Bulgarian. 99.99% of the translations done by this tool are wrong; it mixes words and even grammar from different literary Slavic languages. Unfortunately, people who are not native Bulgarian speakers used it for 3000+ translations, which have had to be fixed manually, and that process is still ongoing. I asked 3 years ago whether this tool could be disabled, but the recommendation was just to ignore it. Now I have to spend tens of hours fixing all those wrong translations that are already visible on the Wikimedia projects.

Thanks for the feedback, @StanProg. From your comment it seems you may be referring to Wikipedia article translations (which are not in the scope of this ticket). For articles created with Content Translation, the tool has a system to encourage users to review the initial translations, and the limits can be adjusted as per the community's needs. For the current year, our data shows 372 articles created with Content Translation in Bulgarian Wikipedia, and 50 of them (13%) were deleted. We are open to adjusting the settings to improve those numbers.

I'm not referring to the Wikipedia article translations. I'm referring to the Apertium translations for translatewiki, i.e. the suggestions by "Apertium WMF" on the message editing page.