
Define a protocol to adjust translation limits for specific wikis based on data
Closed, Resolved (Public)

Description

Aggregated across all languages, Wikipedia articles created with Content Translation are deleted less often than those created from scratch. However, this does not hold for every individual language.

We started collecting data about deletions (T286636), which provides an overview of the Wikipedias where translations are deleted more often than other articles. This can be a good indicator of issues with the tool in those particular languages, which we can research further, and/or a reason to adjust the current quality limits for those wikis so that users are encouraged to review content more thoroughly before publishing.

This ticket proposes defining criteria, based on the above data, for the kind of adjustments to make. This could look something like this:
Make the limits for a wiki 10% stricter when the wiki:

  • Appears in the list of wikis with high deletion more than once in the past 4 quarters.
  • Has a deletion rate difference of more than 5%.
  • Has a CX deletion rate over 10%.
  • Has more than 50 CX articles in a quarter.

The above is just an example, intended to focus on the cases where the issues may be happening consistently rather than being due to data noise (e.g., a small number of articles resulting in a high deletion percentage) or exceptional events (e.g., one user vandalizing).
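The example criteria above can be expressed as a small check. The following is only a sketch under stated assumptions: the `QuarterStats` structure, its field names, and both function names are hypothetical; the four conditions are assumed to be required together; the "deletion rate difference" is assumed to be the CX deletion rate minus the deletion rate of other articles, in percentage points; and "10% more strict" is assumed to mean multiplying a numeric limit by 0.9. None of these details are confirmed by the task.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class QuarterStats:
    """Hypothetical per-wiki, per-quarter deletion data."""
    cx_articles: int       # articles created with Content Translation
    cx_deleted: int        # of those, how many were deleted
    other_articles: int    # articles created without Content Translation
    other_deleted: int     # of those, how many were deleted
    in_high_deletion_list: bool  # wiki was listed as high-deletion this quarter


def deletion_rate(deleted: int, total: int) -> float:
    """Return the deletion rate as a percentage (0.0 when there are no articles)."""
    return 100.0 * deleted / total if total else 0.0


def should_tighten_limits(history: Sequence[QuarterStats]) -> bool:
    """Apply the example criteria to the most recent (up to) four quarters.

    Assumption: all four conditions must hold, and the rate and volume
    conditions are evaluated on the latest quarter.
    """
    recent = list(history)[-4:]
    if not recent:
        return False
    latest = recent[-1]
    appearances = sum(1 for q in recent if q.in_high_deletion_list)
    cx_rate = deletion_rate(latest.cx_deleted, latest.cx_articles)
    other_rate = deletion_rate(latest.other_deleted, latest.other_articles)
    return (
        appearances > 1                  # listed more than once in 4 quarters
        and cx_rate - other_rate > 5.0   # difference above 5 percentage points
        and cx_rate > 10.0               # CX deletion rate above 10%
        and latest.cx_articles > 50      # enough articles to rule out noise
    )


def tightened_limit(current_limit: float) -> float:
    """Assumption: '10% more strict' means lowering the limit by 10%."""
    return current_limit * 0.9
```

Requiring every condition at once keeps the check conservative, which matches the stated goal of reacting to consistent problems rather than to noise or one-off events.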

Task completed
The criteria for adjusting Machine Translation limits in Content Translation based on data provided is documented here.

Event Timeline

Pginer-WMF triaged this task as Medium priority. Feb 18 2022, 4:48 PM

I'm unclear on why this task is on our board. Please refer to https://phabricator.wikimedia.org/T317229 for further information about filing tasks that request CRS support.

@UOzurumba Please do something about my comment.

For context: The idea of this task was to define systematic steps for reacting consistently to an increase in deleted translations on a given wiki (e.g., when to investigate with the community, when to adjust the limits, etc.). This is already completed and has been applied in some cases (e.g., T319156).

What is missing is documenting it publicly on wiki as a section at the end of this page. We can keep the ticket open for this to be completed, or create a separate one for documentation if that better fits the Community Relations Specialist team workflows.

@Pginer-WMF thank you for replying

@UOzurumba Please do something about my comment.

@Elitre Sorry for the late reply. As Pau has explained, I worked on developing different criteria to proactively adjust the MT limits based on available data. I still need to document it on wiki; that is why the task is on our board and still open.