Please refer to https://www.mediawiki.org/wiki/Product_Analytics#How_to_get_help_with_data_or_analysis for help answering these prompts.
>>! In T356765#9537247, @Pginer-WMF wrote:
>>>! In T356765#9517683, @mpopov wrote:
>> @Pginer-WMF: Does this question relate to any of the hypothesis work your team is doing? If so, can you please share which hypothesis?
>
> This is connected with the work the Language team is doing around MinT. The current hypothesis is "Scaling Open Translation service will increase page interactions from underserved communities" and it is part of the Key Result "WE2.2 Interested readers will discover and browse more content".
>
> As machine translation is exposed to more users with an option to contribute, we want to better understand the factors that affect the quality of the content created. Based on our experience with Content Translation, we have identified several potential issues from community reports and anecdotal evidence that data can help to clarify.
**What team/program is this request for?**
Language Team.
**What are you requesting?**
We want to better understand which traits are common in low-quality translations where machine translation is used.
We want to analyze common factors that have been associated with low-quality translations:
- Translations over a short period of time. This is commonly associated with campaigns/contests where some users may be incentivized to create a large number of articles without enough emphasis on quality. What is the impact of publishing too many translations within a certain period on the deletion rate?
- User expertise level. Communities have requested that access to Content Translation or Machine Translation be limited to experienced users. This comes with the assumption that problematic translations are mainly produced by less experienced users, an assumption we want to check and put in perspective. What is the deletion rate by various levels of user experience?
- Length of the content. How long the translation is (in itself or with respect to the original article) is another factor that may signal low-quality translations.
Dimensions of interest: user experience, target language, and length of the translation.
For measuring translation quality, we have used article deletions as a proxy, but additional signals can be considered too.
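To make the requested metric concrete, here is a minimal sketch of how a deletion rate could be computed per group, using deletion as the quality proxy. The records, the experience thresholds, and the "10 or more translations in 24 hours" burst cutoff are all made up for illustration; they are assumptions, not real data or agreed definitions.

```python
from collections import defaultdict

# Hypothetical records: (editor's prior edit count, translations published
# by that editor in the preceding 24 hours, whether the article was deleted).
translations = [
    (3, 1, True),
    (8, 12, True),
    (40, 2, False),
    (250, 1, False),
    (900, 15, True),
    (1200, 1, False),
]

def experience_bucket(edit_count):
    """Assign an illustrative experience level; the thresholds are assumptions."""
    if edit_count < 10:
        return "0-9 edits"
    if edit_count < 500:
        return "10-499 edits"
    return "500+ edits"

def deletion_rate_by(records, key):
    """Return {group: deleted / published} for groups defined by key(record)."""
    published = defaultdict(int)
    deleted = defaultdict(int)
    for record in records:
        group = key(record)
        published[group] += 1
        if record[2]:  # third field is the "was deleted" flag
            deleted[group] += 1
    return {group: deleted[group] / published[group] for group in published}

# Deletion rate by user experience level.
by_experience = deletion_rate_by(translations, lambda r: experience_bucket(r[0]))

# Deletion rate for translations published in a burst (assumed cutoff: 10+ per day).
by_burst = deletion_rate_by(translations, lambda r: "burst" if r[1] >= 10 else "normal")
```

The same helper can be reused for the other dimensions (target language, translation length) by passing a different `key` function.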
**What is the problem you're trying to solve?**
A better understanding of when machine translation is misused for content creation will help us adjust the prevention mechanisms and encourage good use of it.
**What decision will you make or action will you take with the deliverable?**
**Additional details**
We plan to improve the translation limits system (T251887) and this analysis can be useful to (a) identify how to adjust the limits, and (b) set a baseline to identify improvements produced by the new limits.
In addition, as MinT is exposed to Wikipedia readers, they are given options to enter the editing path (contributing an improved translation). This means that translation activity will be exposed to a broader, less experienced audience, which may require additional guidance. Knowing the factors that affect translation quality will be useful to define the best approach to guide/encourage/discourage newcomers to translate in a certain context.