Check translation quality for EN-to-{ES,CA} using Apertium
Closed, Resolved; Public

Description

Apertium provides automatic translation from English to both Spanish and Catalan. We need feedback from users on whether the translated text is useful as a base for translating an article.

We can get this feedback from testing sessions or self-reported data.

Event Timeline

Pginer-WMF claimed this task.
Pginer-WMF raised the priority of this task from to Needs Triage.
Pginer-WMF updated the task description.
Pginer-WMF changed Security from none to None.
Pginer-WMF subscribed.
Arrbee triaged this task as High priority. Dec 10 2014, 2:37 PM
Arrbee edited projects, added LE-Sprint-80; removed LE-Sprint-79.
Arrbee subscribed.

Did a translation test from enwiki to cawiki with the new version of the tool, as requested by Pginer. Results: http://cx.wmflabs.org/index.php/User:Kippelboy/Corina_Cre%C8%9Bu and final result: https://ca.wikipedia.org/wiki/Corina_Cre%C8%9Bu

Grammar comments: gender and number problems from English to Catalan. The tool identifies a woman as a man, and it is hard to pick the right Catalan article ("un", "el"...) from the English "the", which is not marked for gender or number.
Tech comments: the tool adds too many <ref /> tags, some of them with a blank space in the middle. It also fails to add the <references /> tag.

The tool works quite OK, but these language pairs are more difficult to translate than Spanish to Catalan.
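
The <ref /> issues reported above could be checked automatically on the translated wikitext. The following is only a minimal sketch, not part of the tool: the function name `check_references` is hypothetical, and it assumes that "a blank space in the middle" means stray whitespace right after "<" or between "/" and ">" (for example "<ref / >"). Whether there are "too many" tags can only be judged by comparing the count against the source article.

```python
import re

# Assumption: a malformed tag has whitespace right after "<" or between "/" and ">".
MALFORMED_REF = re.compile(r"<\s+/?\s*ref\b[^>]*>|<ref\b[^>]*/\s+>", re.IGNORECASE)
# Opening or self-closing <ref> tags; "\b" keeps <references /> from matching.
ANY_REF = re.compile(r"<\s*ref\b[^>]*>", re.IGNORECASE)
REFERENCES_TAG = re.compile(r"<\s*references\b[^>]*>", re.IGNORECASE)


def check_references(wikitext: str) -> dict:
    """Report <ref> problems in a piece of wikitext (hypothetical helper)."""
    ref_count = len(ANY_REF.findall(wikitext))
    return {
        # Number of opening/self-closing <ref> tags, to compare with the source article.
        "ref_tags": ref_count,
        # Tags that contain stray whitespace, as defined above.
        "malformed_ref_tags": MALFORMED_REF.findall(wikitext),
        # True when footnotes exist but no <references /> tag would render them.
        "missing_references_tag": ref_count > 0
        and REFERENCES_TAG.search(wikitext) is None,
    }


if __name__ == "__main__":
    sample = 'Text<ref name="a" /> more text<ref / > end.'
    print(check_references(sample))
```

Running the same check on the source article and on the translation would show whether the tag count grew during translation.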

Pginer-WMF closed this task as Resolved. Edited Dec 17 2014, 8:31 AM

Initial comments from editors suggest that the machine translation (MT) quality provided for these languages is still useful. I think it makes sense to provide automatic translation for them (especially since it can be disabled/discarded on a per-user basis).

There have been some articles created as part of the process in Catalan and Spanish Wikipedias.

We may want to pay attention to the articles produced in order to identify the impact of lower MT quality for these languages during the beta stage: will it result in low-quality articles (more reverts), in good but shorter articles (users willing to translate only a few paragraphs), or in less productive editors compared to other language pairs (due to the additional effort to correct MT)?