
Simplify the system of limits to make it more predictable
Open, Medium, Public

Description

In Content Translation, a system of limits encourages users to review the initial translations.

The current system makes a decision (prevent publishing, warn or add to a tracking category) based on several factors including the total percentage of unmodified contents in the translation, the number of problematic paragraphs, whether those were marked as reviewed and whether the user had translations deleted in the previous month.

Making the decisions based on the number of paragraphs seems to introduce some problems. For example, longer articles are more likely to include content that can generate false positives such as math formulas (T245827). However, short articles can have a higher percentage of unedited machine translation.

This ticket proposes to simplify the system of limits so that the decision is based on the overall percentage of unmodified content, with adjustments that make this global limit more or less strict.

Proposed approach

The limits are based on two parameters that can be adjusted for each community (ideally exposed through community configuration for them to adjust):

  • Limit (L): 95% by default, indicates how much unedited machine translation is allowed to be published.
  • Flexibility (F): 10% by default, indicates by how many percentage points the limit can be adjusted to make it stricter or more relaxed when needed.

The limit system rules are based on the overall percentage of unedited machine translation (MT):

  • If MT > L: Publishing is prevented. By default, this means users can only publish translations with at most 95% unedited machine translation.
  • If MT > L - F: A warning is shown and the published translation is tagged for the community to review. By default, this means users trying to publish a translation with more than 85% and up to 95% unedited machine translation will be able to publish it, but will get a warning to review it.
    • If the user has had a translation deleted in the last 30 days, publishing is prevented instead. With the default values, this means that a user with a previously deleted translation will only be able to publish translations with at most 85% MT for a month (after that, the usual 95% limit applies again). Given that historically problematic events such as contests have a duration that does not tend to exceed a month, making the limit strict
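
The rules above can be read as a small decision function. The following is a minimal sketch, not the actual Content Translation implementation: the names `Decision`, `LimitConfig` and `decide` are hypothetical, and it assumes that MT, L and F are all expressed as percentages between 0 and 100.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"            # no restriction
    WARN_AND_TAG = "warn_and_tag"  # publish, warn the user, tag for review
    PREVENT = "prevent"            # publishing is blocked


@dataclass(frozen=True)
class LimitConfig:
    """Hypothetical per-community configuration (defaults from the proposal)."""
    limit: float = 95.0        # L: maximum unedited MT allowed, in percent
    flexibility: float = 10.0  # F: adjustment, in percentage points


def decide(mt_percent: float, recently_deleted: bool,
           config: LimitConfig = LimitConfig()) -> Decision:
    """Apply the proposed rules to the overall unedited MT percentage.

    recently_deleted: whether the user had a translation deleted in the
    last 30 days (how this is determined is outside this sketch).
    """
    if mt_percent > config.limit:
        return Decision.PREVENT
    if mt_percent > config.limit - config.flexibility:
        # In the warning band, users with a recent deletion are blocked.
        return Decision.PREVENT if recently_deleted else Decision.WARN_AND_TAG
    return Decision.PUBLISH
```

With the default values, `decide(90, recently_deleted=False)` gives a warning, while `decide(90, recently_deleted=True)` prevents publishing.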

For paragraphs
The limits are calculated for the whole translation. Information at a paragraph level is used for user guidance and incorporating their feedback.

  • For each paragraph, a warning is shown when the amount of unedited machine translation is considered high, as guidance for the user about which paragraphs they may need to edit to improve the overall percentage for the article.
  • Paragraph warnings include an option to indicate that the user has already reviewed the translation, which compensates for cases where the machine translation is very good. In such cases an additional percentage will be considered as modified.

So, for each paragraph:

  • If MT > L - F: A warning is shown to indicate that the paragraph contains too much unedited machine translation. By default, this happens when the paragraph contains over 85% unedited machine translation.
    • If the user marks the warning as resolved, the paragraph MT will be computed as (MT - ½F). This only applies to users without a deleted translation during the last 30 days. For example, if a user marks a paragraph with 90% unedited machine translation as resolved, the paragraph will be considered as having 85% unedited machine translation instead (90% - ½*10%).
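
The paragraph-level rules can be sketched in the same way. This is an illustration only: the function names are hypothetical, percentages are assumed to be on a 0 to 100 scale, and flooring the result at 0 is an assumption the proposal does not state. How paragraph values are aggregated into the overall percentage is not specified here and is left out.

```python
def paragraph_needs_warning(mt_percent: float, limit: float = 95.0,
                            flexibility: float = 10.0) -> bool:
    """A paragraph is flagged when its unedited MT exceeds L - F."""
    return mt_percent > limit - flexibility


def effective_paragraph_mt(mt_percent: float, marked_resolved: bool,
                           recently_deleted: bool,
                           flexibility: float = 10.0) -> float:
    """Return the MT percentage counted for a paragraph.

    Marking the warning as resolved credits half of F as modified,
    but only for users without a deletion in the last 30 days.
    """
    if marked_resolved and not recently_deleted:
        # Assumption: the result never goes below 0%.
        return max(0.0, mt_percent - flexibility / 2)
    return mt_percent
```

For the example in the text, `effective_paragraph_mt(90, marked_resolved=True, recently_deleted=False)` returns `85.0`.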

This system is expected to be easier to communicate (we can show the percentage users need to reach to be able to publish, and the paragraphs that can be edited), to account for false positives (letting users confirm when a translation is good) and to prevent abuse (catching one problematic translation will require additional editing from the user during a limited period of time). Combined with limiting the pace of article creation (T331023), this approach can also help with the spikes of low-quality content that contests and events have caused in some cases.

Event Timeline

From our past observations, especially during translation campaigns, many users participate, potentially creating low-quality articles. The review happens much later. Reviewers have also complained that they cannot review all these articles in time. When the review happens, articles get deleted. So the deletion happens weeks after the translation activity. Considering this, the chances that a new user already has a deleted translation while making intentionally or unintentionally low-quality translations are low.
Hence, the proposed stricter limit for users with a deletion in the last 30 days might not have the expected effect. However, I support keeping it in place. But it should be clearly communicated to the user why stricter limits apply to their translations.

In other words, when personalized limits are in place, the person needs to know the limits and the rationale, both for transparency and to avoid spending time on a translation and later wondering what is happening.

This ticket is about the MT modification calculation, but for the record, let me also propose limitations based on usage patterns.

  1. We can discuss rate limiting translations per hour (or a similar duration) to filter out users who translate to meet some external goal (such as being the top translator in a campaign) while producing low-quality translations. For example, could the proposed F value be reduced by 1 unit for every new translation within that time frame?
  2. Or even preventing the user from starting a translation? There are several aspects to consider for such a limitation, so that it does not discourage power users of the tool who make excellent translations. Maybe the reverse of it: rewarding people with extra F value when they have published good-quality articles in the past?
  3. Another idea is calculating the total time taken from start_translation to publish_translation in relation to the amount of content being published.
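
As a rough illustration of idea 1 only, the sketch below reduces F by one percentage point for each translation the user published within a recent time window. Everything here is an assumption made for illustration: the function name `adjusted_flexibility`, the one-hour default window, and flooring F at 0 are not part of any agreed design.

```python
from datetime import datetime, timedelta
from typing import Iterable


def adjusted_flexibility(base_f: float, publish_times: Iterable[datetime],
                         now: datetime,
                         window: timedelta = timedelta(hours=1)) -> float:
    """Reduce F by 1 point per translation published within the window.

    publish_times: timestamps of the user's earlier publications
    (hypothetical input; how they are obtained is outside this sketch).
    """
    recent = sum(1 for t in publish_times
                 if timedelta(0) <= now - t <= window)
    # Assumption: F never drops below 0.
    return max(0.0, base_f - recent)
```

A user who published twice in the last hour would get F = 8 instead of 10 under this sketch, which also shrinks the ½F credit for paragraphs marked as reviewed.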

From our past observations, especially during translation campaigns, many users participate, potentially creating low-quality articles....

Good points, Santhosh.

I agree that the review happening later reduces the effectiveness of making the limit stricter after deletions. The benefit is that each instance the reviewers catch will potentially save them from reviewing an additional set of quick translations that the same user would otherwise create. In any case, I think that other measures are needed to support the scenario of contests/campaigns (including increasing the visibility of those for reviewers, so that they can catch issues earlier and prevent more of them).

In particular, rate limiting seems a great idea. Based on earlier conversations we had on this, there is already a proposal along the lines of approach #2 above: T331023: Limit the publication of fast unreviewed translations