Learn from user corrections to avoid editing the same term again and again
Open, NormalPublic

Description

When translating, users have to correct some terms that the Machine Translation (MT) service does not translate properly. For example, when translating the John Carpenter article, the director's surname can be translated into whichever term the local language uses for the "carpenter" profession. Since an article is about a specific topic, chances are that users will need to fix the same mistakes again and again.

From our user testing sessions we have observed that while fixing it the first time is reasonable, users were negatively surprised that the system didn't learn the lesson for the next time.

While improving MT services is probably out of scope for the project, it may be worth thinking of ways CX can save the user time in that correction process. Some of these mechanisms can also be useful when there is no MT at all, acting like a very basic (maybe word-level) MT-like system based on what you have already translated.

Proposed solution

  • Keep track of user corrections on MT that happen repeatedly. We need to decide how many times a change must occur, how many words it may span, and how long those words should be for it to count as a correction.
  • Apply previous corrections when a paragraph is added, if the corrected word is found in it.
  • Provide a way for users to switch among the alternatives (which include the MT proposed term and the one used in previous corrections).
  • Learn from the use of the alternatives to decide whether to apply corrections automatically or just suggest them.
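
The first and third points above could be sketched roughly as follows. This is only an illustration, not the CX implementation: the `CorrectionMemory` class, its method names, and the `min_repeats` threshold are all hypothetical, and corrections are assumed to arrive as already-aligned (MT term, user term) pairs.

```python
from collections import Counter, defaultdict


class CorrectionMemory:
    """Hypothetical per-translation store of user corrections to MT output."""

    def __init__(self, min_repeats=1):
        # Assumed threshold: how often a replacement must be seen
        # before it is treated as a reusable correction.
        self.min_repeats = min_repeats
        self._counts = defaultdict(Counter)  # MT term -> Counter of user terms

    def record(self, mt_term, user_term):
        """Remember that the user replaced mt_term with user_term."""
        if mt_term != user_term:
            self._counts[mt_term][user_term] += 1

    def alternatives(self, mt_term):
        """Return the MT term first, then user corrections by frequency."""
        counter = self._counts.get(mt_term, Counter())
        return [mt_term] + [term for term, _ in counter.most_common()]

    def best_correction(self, mt_term):
        """Return the most frequent correction if it meets the threshold."""
        counter = self._counts.get(mt_term)
        if not counter:
            return None
        term, seen = counter.most_common(1)[0]
        return term if seen >= self.min_repeats else None


memory = CorrectionMemory(min_repeats=2)
memory.record("The Angels", "Los Angeles")
memory.record("The Angels", "Los Angeles")
print(memory.alternatives("The Angels"))
```

How a raw edit is turned into such a pair (word alignment, phrase boundaries) is left open here, as it is in the proposal.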

We'll illustrate the idea with the example of translating the Los Angeles article from Spanish to English. Since "angeles" means "angels" in Spanish, we'll assume that the MT service is going to translate the name of the city too literally, and the user corrects those in the first paragraph:

Mockups: "Initial translation with errors" and "User corrects the first paragraph".

When adding the second paragraph (where the name of the city appears again), the system will automatically replace it in the text proposed by the MT, based on the fact that the user has corrected it in previous paragraphs. In addition, the replaced text will be highlighted to communicate to the user that a correction was applied automatically. This allows the user to undo the automatic change.
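
A minimal sketch of that replacement step, assuming corrections are available as a plain mapping from MT term to user term. The function name `apply_corrections` and the span format are hypothetical; each span records where the replacement landed and what the MT originally proposed, so a UI could highlight it and offer an undo.

```python
import re


def apply_corrections(mt_text, corrections):
    """Apply known corrections to MT output.

    Returns (new_text, spans), where each span is
    (start, end, original_mt_term) measured in new_text.
    Assumes every key starts and ends with a word character.
    """
    if not corrections:
        return mt_text, []
    # Try longer terms first so a phrase wins over a word it contains.
    keys = sorted(corrections, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b" + re.escape(k) + r"\b" for k in keys))

    parts, spans = [], []
    pos = 0      # read position in mt_text
    out_len = 0  # length of the output built so far
    for match in pattern.finditer(mt_text):
        before = mt_text[pos:match.start()]
        replacement = corrections[match.group(0)]
        parts.append(before)
        out_len += len(before)
        spans.append((out_len, out_len + len(replacement), match.group(0)))
        parts.append(replacement)
        out_len += len(replacement)
        pos = match.end()
    parts.append(mt_text[pos:])
    return "".join(parts), spans


text, spans = apply_corrections(
    "The Angels is a city. The Angels has beaches.",
    {"The Angels": "Los Angeles"},
)
print(text)
print(spans)
```

Matching here is exact and case-sensitive, which is the simplest possible choice; inflected forms would need something smarter.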

Whether to apply the correction automatically or let the user do it will depend on previous decisions of the user for that word. If the user undoes an alternative that was applied on a paragraph, we can consider not using the replacement word in following paragraphs, and just highlight the MT version to let the user know they can pick an alternative manually. In the example below, the correction is not applied automatically, only suggested for the user to apply:
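
One very simple policy for this decision, given only as a sketch: remember the user's most recent reaction to each automatic replacement, and fall back to suggesting once a replacement has been undone. The `CorrectionPolicy` class and its method names are hypothetical, and a real policy would probably weigh more history than the last action.

```python
class CorrectionPolicy:
    """Hypothetical policy: auto-apply unless the last replacement was undone."""

    def __init__(self):
        self._last_undone = {}  # MT term -> True if the latest replacement was undone

    def record_kept(self, mt_term):
        """The user kept (or manually picked) the correction for mt_term."""
        self._last_undone[mt_term] = False

    def record_undone(self, mt_term):
        """The user reverted an automatically applied correction."""
        self._last_undone[mt_term] = True

    def mode(self, mt_term):
        """Return "auto" to apply the correction, "suggest" to only offer it."""
        return "suggest" if self._last_undone.get(mt_term, False) else "auto"


policy = CorrectionPolicy()
policy.record_undone("The Angels")
print(policy.mode("The Angels"))
```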


A first step in this direction is providing alternatives for link labels based on the target article title (T197662).

Related Objects

Event Timeline

Pginer-WMF updated the task description.
Pginer-WMF raised the priority of this task from to Needs Triage.
Pginer-WMF added a subscriber: Pginer-WMF.
Restricted Application added a subscriber: Aklapper. Apr 15 2015, 5:18 PM
Amire80 added a subscriber: Amire80. May 4 2015, 4:50 PM

This is something that I'd really love to have in some way, as a collaboration with dictionary and MT builders.

Somewhat related issues: T91748, T95886, T92243.

Amire80 triaged this task as Low priority. May 4 2015, 4:51 PM
Amire80 moved this task from Needs Triage to Bugs on the ContentTranslation board. Jul 2 2015, 5:08 PM
Amire80 raised the priority of this task from Low to Normal.
Amire80 set Security to None.
Amire80 lowered the priority of this task from Normal to Low. Oct 15 2015, 10:14 AM

A related case of quick correction for what MT proposes is link translation (T145009). When automatic translation fails for links, the linked article title may be the intended information (or a good-enough approximation). Suggesting that as a quick correction can be helpful.

The issue of repeatedly fixing "errors, typos and mistakes" of MT was mentioned in this comment.

Framawiki added a subscriber: Framawiki.
Pginer-WMF updated the task description. Jun 27 2017, 8:15 AM
Pginer-WMF claimed this task.
Pginer-WMF updated the task description. Jul 4 2017, 11:57 AM

I added mockups and illustrated an example to show how the feature could work.

Pginer-WMF raised the priority of this task from Low to Normal.
Pginer-WMF removed Pginer-WMF as the assignee of this task.
Pginer-WMF removed a subscriber: Pginer-WMF.
Pginer-WMF added a subscriber: Pginer-WMF.

Would the proposed solution work with accented letters (e.g. the case listed at T152905, which is very common in math articles)?

The initial idea is to consider any modification made to the text as a correction, which should work with accented letters and other symbols. However, we may want to consider certain thresholds as we start working on this ticket. For example, if the user changes one character from lowercase to uppercase, is this a correction we want to apply automatically the next time, or is that correction likely to fail when applied in the next grammatical context?
In any case, since the approach makes it easy to undo the changes, I think it should be OK to start with a basic approach and learn from the different situations we observe in the different languages.
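
To make the threshold question concrete, here is a small sketch of how a change could be classified before deciding what to do with it. Nothing here is decided in this task: the `classify_change` function and its categories are hypothetical, and it only uses Python's standard `unicodedata` module to tell case-only and accent-only edits apart from larger ones.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks (accents) after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_change(before, after):
    """Classify a user edit as none, case-only, diacritics-only or substantive."""
    if before == after:
        return "none"
    if before.casefold() == after.casefold():
        return "case-only"
    if strip_marks(before) == strip_marks(after):
        return "diacritics-only"
    return "substantive"


print(classify_change("angels", "Angels"))
print(classify_change("Poincare", "Poincaré"))
print(classify_change("The Angels", "Los Angeles"))
```

A case-only change might, for instance, be suggested rather than applied automatically, while an accent fix could be applied like any other correction; which of these behaviours is right would have to come from observing real translations.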

Pginer-WMF updated the task description. Jun 19 2018, 10:36 AM
Arrbee moved this task from Bugs to Enhancements on the ContentTranslation board. Jun 22 2018, 1:41 PM
Pginer-WMF updated the task description. Jul 16 2018, 10:16 AM