
Adjust the threshold for Chinese Wikipedia to prevent publishing when overall unmodified content is higher than 70%
Closed, Resolved (Public)

Description

Currently, there is a request (https://zh.wikipedia.org/wiki/Wikipedia:互助客栈/其他#請求禁止非自動確認用戶使用「內容翻譯」工具) to disable ContentTranslation completely for non-autoconfirmed users because:
1: CX cannot display a warning from the abuse filter. (This should have been fixed by T123912: Highlight the sections which causes abusefilter error, and has been fixed by an admin on Chinese Wikipedia.)
2: CX cannot stop new users from publishing pure machine translation.

We know that disabling CX completely for non-autoconfirmed users (i.e. not allowing them to use the tool at all) is not feasible, since it would require rewriting the BetaFeatures extension to support user group restrictions.

Instead, we should address the second concern by adding zhwiki to wmgContentTranslationUnmodifiedMTThresholdForPublish with a value of 70, similar to what was implemented in T228971: Adjust the threshold for Indonesian to prevent publishing when overall unmodified content is higher than 70%.
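To make the proposal concrete, here is a minimal sketch of how a per-wiki unmodified-MT threshold could gate publishing. Only the two values of 70 (zhwiki as proposed here, idwiki from T228971) come from this task; the default of 95, the dictionary layout, and the function names are assumptions for illustration and do not describe the actual ContentTranslation code.

```python
# Hypothetical per-wiki thresholds: the percentage of unmodified machine
# translation above which publishing is blocked.
ASSUMED_DEFAULT_THRESHOLD = 95.0  # assumption, not taken from this task

THRESHOLDS = {
    "idwiki": 70.0,  # from T228971
    "zhwiki": 70.0,  # proposed in this task
}


def threshold_for(wiki: str) -> float:
    """Return the threshold for a wiki, falling back to the assumed default."""
    return THRESHOLDS.get(wiki, ASSUMED_DEFAULT_THRESHOLD)


def can_publish(wiki: str, unmodified_mt_percent: float) -> bool:
    """Allow publishing only if unmodified MT is not higher than the threshold."""
    return unmodified_mt_percent <= threshold_for(wiki)


print(can_publish("zhwiki", 65.0))  # below the threshold
print(can_publish("zhwiki", 82.5))  # above the threshold
```

With this reading, a translation on zhwiki that is exactly 70% unmodified MT would still be publishable; whether the real check is inclusive or exclusive at the boundary is not stated in this task.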

Event Timeline

Thanks for creating this ticket, @VulpesVulpes825.

We are open to adjusting the limits as we did for Telugu (T244769) or Indonesian (T228971), but I'd like to understand the situation in Chinese Wikipedia a bit better.

Looking at the numbers, the deletion ratio in Chinese Wikipedia looks much lower for articles created with Content Translation, compared to articles created from scratch:

Metric | Recent months (Jan-Feb, 2020) | Past quarter (Oct-Dec, 2019) | Last year (Jan-Dec, 2019)
Deletion ratio for articles created with Content Translation | 4.6% | 5.9% | 7.6%
Deletion ratio for articles created without Content Translation | 8.9% | 12% | 13%

Based on these numbers, an article created with Content Translation is about half as likely to be deleted as an article created from scratch in Chinese Wikipedia.
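A quick arithmetic check of the Jan-Feb 2020 figures shows what the comparison amounts to: the deletion ratio with Content Translation is roughly half the ratio without it, while the survival rates differ by only a few percentage points. The variable names are illustrative; the two input percentages are the ones reported in this comment.

```python
# Deletion ratios (percent) for Chinese Wikipedia, Jan-Feb 2020.
deleted_with_cx = 4.6
deleted_without_cx = 8.9

# How the deletion risk with CX compares to the risk without it.
relative_deletion_risk = deleted_with_cx / deleted_without_cx

# Survival is simply the complement of deletion.
survival_with_cx = 100 - deleted_with_cx
survival_without_cx = 100 - deleted_without_cx

print(f"relative deletion risk: {relative_deletion_risk:.2f}")  # 0.52
print(f"survival with CX: {survival_with_cx:.1f}%")              # 95.4%
print(f"survival without CX: {survival_without_cx:.1f}%")        # 91.1%
```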

In addition, the deletion ratios for Content Translation do not look very different from what we see across other languages. Looking at the analytics from last year (2019), the overall deletion ratio across all languages was 5% for articles created with Content Translation, and 11% for articles created without using the tool.

Given these numbers, it is not clear to me why the content quality issue is attributed to Content Translation. Preventing users from using Content Translation may push them to an alternative that results in higher deletion ratios.
However, numbers tell us only part of the story, and it would be very useful to understand in more detail which issues the community is experiencing.

@Pginer-WMF, Thank you for your response.

I also checked the statistics that Content Translation provides within the extension when I created this ticket, and I do see that the deletion rate of CX-produced content is really low.

The discussion about disabling CX has become long and messy, but what I can summarize from it is that some people want to disable CX because:
1: CX cannot display warnings from the abuse filter, and edits can slip through the abuse filter by clicking submit again. This should no longer be an issue, as a Chinese Wikipedia admin has fixed the abuse filter code on their end.
2: CX cannot stop new users from publishing pure machine translation. Some people argue that providing MT by default is bad. This is the reason why I asked for a threshold change, as some others want ContentTranslationTargetNamespace, ContentTranslationPublishRequirements, and wmgContentTranslationUnmodifiedMTThresholdForPublish to be set.
3: CX has countless bugs. I have argued in the discussion that if you find a bug, you should report it on Phabricator rather than complain about bugs in a beta feature without filing a bug report.

Your statistics do support my view that CX does a fine job on Chinese Wikipedia, and if you do not mind, I will translate your response into Chinese and post it in the discussion for more feedback. Thank you.

@Pginer-WMF Please read this comment from https://zh.wikipedia.org/wiki/Wikipedia:互助客栈/其他#請求禁止非自動確認用戶使用「內容翻譯」工具

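A stray space can pop up anywhere, whether after a parenthesis or after a full stop, and that is in translations by the new users who do pay attention to quality. Some users never deal with those odd mid-sentence spaces at all and publish straight to the article namespace; if a patroller is even slightly careless, the chance of a G13 is high. Likewise, when I used Content Translation a week ago and, fed up with VE, tried to publish to a draft so I could switch to source editing, I again hit the problem of being blocked by AF with no reason given. In other words, Xiplus's fix has not been fully effective; edits can still be blocked by AF for unknown reasons. In summary, I still insist on barring newcomers from using the Content Translation tool. This is not discrimination against users but a convenience for newcomers, so that their efforts are not wasted because of an unexplained AF block or a G13. By contrast, manual translation allows switching freely between VE and source editing, which in fact helps newcomers become familiar with source editing; the source text in the User or Draft namespace is available for the community to inspect and help find the problem; and every edit blocked by AF leaves a detailed filter log, which is undoubtedly the most effective information for helping newcomers. I also strongly {{抗议}} (protest) that the Foundation's developers, in order to force the Content Translation tool to be kept, compared two sets of data that are not comparable, swapping concepts by comparing Content Translation deletion counts with other kinds of deletions, which is clearly muddying the waters. Furthermore, I think the reason for the low number of deletions among articles created with the Content Translation tool is not that it is superior to manual translation, but that most ordinary editors translate manually, so there are simply more manually translated articles, while some articles made with the tool were never published because of unexplained AF blocks or the publishing obstacle that 宇帆 ran into; those successfully published are only a small part of all Content Translation articles. That is how this odd data can be understood. -- @DWYoungDLS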

It means:

  1. Some errors from Content Translation cannot be understood or are not shown.
  2. Although Content Translation can show the error message from AbuseFilter, it does not record what the error was; we need this.
  3. Don't rely on deletion numbers, because articles are deleted for many reasons, not only CSD.
  4. Most editors translate by themselves.
  5. Some editors translate with Content Translation, but cannot publish because of either an unknown obstacle or an unknown error message from AbuseFilter.

Based on data provided by Xiplus, 106 articles were deleted due to rough translation in the last 120 days. Disregarding the 9 articles that have no creation record, the ratio of deleted articles created with CX to those created without it is 29:68, or 30%:70%. In other words, on average 0.24 CX-created articles per day were deleted due to rough translation in the past 120 days.
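The figures in this comment can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted above; reading the 0.24 figure as CX-created deletions per day is an inference from the arithmetic (29 / 120), not something the data source states explicitly.

```python
DAYS = 120
deleted_total = 106        # articles deleted due to rough translation
no_creation_record = 9     # excluded: no creation record available
deleted_cx = 29            # created with Content Translation
deleted_non_cx = 68        # created without Content Translation

# The two groups should account for every article that has a creation record.
with_record = deleted_total - no_creation_record
assert with_record == deleted_cx + deleted_non_cx

cx_share = deleted_cx / with_record
non_cx_share = deleted_non_cx / with_record
cx_per_day = deleted_cx / DAYS

print(f"articles with a creation record: {with_record}")  # 97
print(f"CX share of deletions: {cx_share:.0%}")            # 30%
print(f"non-CX share of deletions: {non_cx_share:.0%}")    # 70%
print(f"CX deletions per day: {cx_per_day:.2f}")           # 0.24
```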


But the 0.24 figure only covers articles that:

  1. were created with Content Translation,
  2. were published successfully, and
  3. were deleted in the article namespace with G13 in the deletion summary.

Poor translations are deleted not only through G13 but also through XFD.

@Sunny00217, G13 (rough translation) deletions should be a good statistic for summarizing the current situation. If you believe XFD also includes deletions of rough translations, which should be minimal considering that Chinese Wikipedia has speedy deletion for rough translation, then please provide data to support your claim so that we can pinpoint the issue. Thank you.

Based on data provided by Xiplus, 106 articles were deleted due to rough translation in the last 120 days. Disregarding the 9 articles that have no creation record, the ratio of deleted articles created with CX to those created without it is 29:68, or 30%:70%. In other words, on average 0.24 CX-created articles per day were deleted due to rough translation in the past 120 days.

I'd consider that a deletion ratio of 24-30% of Content Translation articles seems high (even if the deletion ratio for new page creations is much higher). So it is worth looking in more detail. I tried to measure those numbers but I'm not getting the same results.

Our generic data query tool (Turnilo) shows 1100 articles created with Content Translation in the last 120 days, 49 of which were deleted. That's a 4.5% deletion ratio. It is worth noting that Turnilo does not yet have the data for the last month (March 2020), since data is loaded periodically with a delay of one month. So the results below cover Nov 2019 - Feb 2020 (I'll check again once March data is available):

[Image: Screenshot 2020-04-02 at 10.28.55.png]

Given that March data was missing, I took a look at the Content Translation stats page, which takes data directly from the Content Translation database. It does not show an unusual spike of deletions for March. For most weeks in March there are around or over 100 translations, with fewer than 10 articles deleted.

[Image: Screenshot 2020-04-02 at 10.23.15.png]

So I wonder if the calculations for the 106 deleted articles also count some that were not created with Content Translation. Content Translation articles have a "contenttranslation" edit tag (you can check this filtered view of recent changes to inspect the recent ones). Do you have more details on how the 106 deleted translations are calculated?
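The "contenttranslation" edit tag can also be queried through the MediaWiki Action API. The sketch below only builds a request URL for `list=recentchanges` filtered by that tag; the choice of properties, the limit, and the function name are assumptions for illustration, and the URL has to be fetched separately to see results.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://zh.wikipedia.org/w/api.php"


def tagged_new_pages_url(tag: str = "contenttranslation", limit: int = 50) -> str:
    """Build an Action API URL listing recent page creations that carry a tag."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": tag,          # only changes carrying this edit tag
        "rctype": "new",       # only page creations
        "rcnamespace": 0,      # article namespace
        "rcprop": "title|timestamp|tags",
        "rclimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(tagged_new_pages_url())
```

Recent changes only cover a limited time window, so a query like this helps with spot checks of recent translations rather than with a 120-day deletion count.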

Also, replying to the specific issues below:

In T246383#6021281, @Sunny00217 wrote:
  1. Some errors from Content Translation cannot be understood or are not shown.

Wikipedia content and processes are complex. Although some errors may seem similar on the surface, they may have very different origins. It would be very helpful to get specific examples. Also, since we are not familiar with Chinese or with particular Chinese Wikipedia policies, it is very useful if the explanation gives us enough context to understand what the current result is, what the expected result is, how they differ, and why the difference matters. Once we have a specific reproducible case, engineers can investigate and solve it.

  2. Although Content Translation can show the error message from AbuseFilter, it does not record what the error was; we need this.

I included below an example of an abuse filter error. There the abuse filter title is shown ("Youtube links") and an error card will provide more details:

[Image: af-cx2-testing.png]

Is that similar to what you see for Chinese? I'm not sure what you mean by "it can't record what error was".

  3. Don't rely on deletion numbers, because articles are deleted for many reasons, not only CSD.

Deletion numbers are just an approximation of article quality. They are far from perfect, but they are what we can evaluate since we are not native Chinese speakers. Getting feedback from the community is really important. We are using the numbers as a starting point for these conversations to try to understand what is happening in more detail.

  4. Most editors translate by themselves.

Does it mean that they use Content Translation but not machine translation, or that they are not using Content Translation at all?

  5. Some editors translate with Content Translation, but cannot publish because of either an unknown obstacle or an unknown error message from AbuseFilter.

As mentioned above, we need to check how abuse filters are currently communicated. This is a complex area, since filters are defined by each wiki and Content Translation needs to define a generic mechanism that works well for most cases.

@Pginer-WMF, Thank you for your response.

The discussion about disabling CX has become long and messy, but what I can summarize from it is that some people want to disable CX because:
1: CX cannot display warnings from the abuse filter, and edits can slip through the abuse filter by clicking submit again. This should no longer be an issue, as a Chinese Wikipedia admin has fixed the abuse filter code on their end.

As mentioned above, we need to check how abuse filters are currently communicated. Content Translation uses the regular process to publish a page, so it should not be possible to skip an abuse filter. If an abuse filter prevents given content from being published, it won't be published with any tool (Content Translation, Visual Editor, or any other).

2: CX cannot stop new users from publishing pure machine translation. Some people argue that providing MT by default is bad. This is the reason why I asked for a threshold change, as some others want ContentTranslationTargetNamespace, ContentTranslationPublishRequirements, and wmgContentTranslationUnmodifiedMTThresholdForPublish to be set.

Content Translation already has mechanisms to limit the publication of unmodified machine translation. In order to adjust them, we need to know how frequently translations are deleted and what percentage of machine translation they contained when published (so thanks for all the input so far, which is really helpful!). The translation debugger can be useful for getting the percentage of unmodified translation for a given translation.

3: CX has countless bugs. I have argued in the discussion that if you find a bug, you should report it on Phabricator rather than complain about bugs in a beta feature without filing a bug report.

As mentioned above, the tool deals with different complexities (wiki content, differences across wikis, integration of external translation services) and it fails in some cases. Fortunately, we have automated tests, so once a specific problematic case is solved it won't cause problems again. So I'd also encourage reporting specific cases that we can replicate.

Your statistics do support my view that CX does a fine job on Chinese Wikipedia, and if you do not mind, I will translate your response into Chinese and post it in the discussion for more feedback. Thank you.

Sure. Feel free to share any comment or piece of data.

@Pginer-WMF

In T246383#6022068, @Pginer-WMF wrote:
......

  2. Although Content Translation can show the error message from AbuseFilter, it does not record what the error was; we need this.

I included below an example of an abuse filter error. There the abuse filter title is shown ("Youtube links") and an error card will provide more details:

[Image: af-cx2-testing.png]

......

We know it can show the error, but we need the errors to be recorded in Special:AbuseLog, and that did not happen.

So I wonder if the calculations for the 106 deleted articles also count some that were not created with Content Translation. Content Translation articles have a "contenttranslation" edit tag (you can check this filtered view of recent changes to inspect the recent ones). Do you have more details on how the 106 deleted translations are calculated?

Here is the code from Xiplus. It seems I paraphrased too much: the deletion rate compares articles that were created with CX and deleted due to rough translation against articles that were not created with CX and deleted due to rough translation.
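The two figures discussed in this thread measure different things, which a short sketch makes explicit. The function names are illustrative; the inputs are the numbers quoted earlier (29 and 68 from Xiplus's data, 49 and 1100 from Turnilo). Note that the two sources cover different time windows and that the Turnilo count includes deletions for any reason, so the results are not directly comparable.

```python
def share_of_deletions(deleted_in_group: int, deleted_all_groups: int) -> float:
    """Fraction of all deleted articles that belong to one group."""
    return deleted_in_group / deleted_all_groups


def deletion_ratio(deleted_in_group: int, created_in_group: int) -> float:
    """Fraction of the articles created in one group that were later deleted."""
    return deleted_in_group / created_in_group


# Xiplus's data: rough-translation deletions, CX versus non-CX.
cx_share = share_of_deletions(29, 29 + 68)

# Turnilo: CX articles created and deleted (any reason), Nov 2019 - Feb 2020.
cx_ratio = deletion_ratio(49, 1100)

print(f"CX share of rough-translation deletions: {cx_share:.1%}")  # 29.9%
print(f"deletion ratio of CX articles: {cx_ratio:.1%}")            # 4.5%
```

The first number has deleted articles as its denominator, the second has created articles, which is why a 30% share and a 4.5% ratio can both be correct.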

but I'm not getting the same results.

Those 106 pages were deleted in the last 3 months, but they were not necessarily created in the last 3 months.

By the way, some pages from Content Translation may be translated again, so we can't count how many translations were good or bad, or their percentage.

In T246383#6022111, @Sunny00217 wrote:

@Pginer-WMF

In T246383#6022068, @Pginer-WMF wrote:
......

  2. Although Content Translation can show the error message from AbuseFilter, it does not record what the error was; we need this.

I included below an example of an abuse filter error. There the abuse filter title is shown ("Youtube links") and an error card will provide more details:
......

We know it can show the error, but we need the errors to be recorded in Special:AbuseLog, and that did not happen.

Thanks for the details. We'll investigate this.

In T246383#6025493, @Sunny00217 wrote:

By the way, some pages from Content Translation may be translated again, so we can't count how many translations were good or bad, or their percentage.

In Turnilo (the tool I used in T246383#6021991) it is possible to focus on the translations that create a new page, which avoids counting twice when a user publishes the same translation multiple times. In general, it is also possible that the initial translation is later significantly modified by editors, but that is part of the purpose of the tool (providing a new version that is a good starting point so the topic can be better covered). As I mentioned, measuring the deletions does not tell the whole story, but I think it is still a relevant data point, since it tells us that someone considered the content so problematic that they preferred not to have it.
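The idea of counting only page-creating translations, so that republishing is not counted twice, can be sketched as follows. The event structure and field names are hypothetical and do not reflect Turnilo's actual schema.

```python
from typing import Iterable, NamedTuple


class PublishEvent(NamedTuple):
    """Hypothetical record of one publish action from Content Translation."""
    translation_id: int
    target_title: str
    creates_new_page: bool


def count_new_page_translations(events: Iterable[PublishEvent]) -> int:
    """Count distinct target pages whose first publish created the page."""
    created_titles = set()
    for event in events:
        if event.creates_new_page:
            created_titles.add(event.target_title)
    return len(created_titles)


events = [
    PublishEvent(1, "Article A", True),
    PublishEvent(1, "Article A", False),  # same translation published again
    PublishEvent(2, "Article B", True),
]
print(count_new_page_translations(events))  # 2
```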

I included below an example of an abuse filter error. There the abuse filter title is shown ("Youtube links") and an error card will provide more details:

[Image: af-cx2-testing.png]

Is that similar to what you see for Chinese? I'm not sure what you mean by "it can't record what error was".

No warning message with the AbuseFilter description is shown above the translation. It is only shown on the right (without the AbuseFilter description). See https://commons.wikimedia.org/wiki/File:CX_AbuseFilter_disallowed_card_zhwiki.png.
Related task: T246215

@Pginer-WMF There are more errors (from https://zh.wikipedia.org/wiki/Wikipedia:互助客栈/技术#新BUG:“正在加载保存的翻译...”无限加载):
At least two users have encountered a bug where, when they recently used Content Translation, their saved translation could not be loaded normally.
Log : https://t.me/wikipedia_zh_help/26952

@Sunny00217, please open new tickets for different issues, thank you.


If we can change the setting, it makes sense to open new tickets.

In T246383#6029837, @Sunny00217 wrote:

Yes. We are open to changing the default to whichever option (Yandex, Google, LingoCloud, or Youdao) works best for the community.


Among the current options, Google should be the best in terms of translation quality. DeepL, Baidu, Sogou, and Bing should also be considered as future MT options, since these services sometimes produce better MT quality. It would be best to file that as a new task and link it to the CX MT parent tracking ticket, rather than discussing it here.

Change 592479 had a related patch set uploaded (by VulpesVulpes825; owner: Junyin Chen):
[operations/mediawiki-config@master] wmf-config/: Adjust MT threshold for Chinese Wikipedia to 70%

https://gerrit.wikimedia.org/r/592479

VulpesVulpes825 triaged this task as Low priority.

For reference, I updated the above table with the deletion ratios to include the March results that are available now:

Metric | Recent quarter (Jan-Mar, 2020) | Previous quarter (Oct-Dec, 2019) | Last year (Jan-Dec, 2019)
Deletion ratio for articles created with Content Translation | 5.0% | 5.9% | 7.6%
Deletion ratio for articles created without Content Translation | 13% | 12% | 13%

Thanks for the patchset, @VulpesVulpes825. It is great to see this kind of contribution. However, deploying it now seems a bit sudden without the involvement of the team that maintains the affected tool.

We'll be discussing the adjustment of the thresholds on Thursday and assessing this particular change. As I mentioned earlier, the input from the community is key, and we are inclined to adjust the limits based on it, but we need some time to discuss and evaluate the situation.

VulpesVulpes825 changed the task status from Open to Stalled. Apr 28 2020, 7:48 AM

@Pginer-WMF No worries. The community has moved away from the plan to completely disable CX for the non-autoconfirmed users, which means we are already on a good track.

Change 592479 merged by jenkins-bot:
[operations/mediawiki-config@master] Adjust ContentTranslation MT threshold for Chinese Wikipedia to 70%

https://gerrit.wikimedia.org/r/592479

Mentioned in SAL (#wikimedia-operations) [2020-05-05T11:09:09Z] <kartik@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit|592479|Adjust ContentTranslation MT threshold for Chinese WP to 70% (T246383)]] (duration: 01m 01s)

VulpesVulpes825 moved this task from Backlog to Closed on the Chinese-Sites board.

The change should be live now. Thanks everyone involved for their input.
Please provide feedback on how the limits work after the adjustment. Finding the right balance may require more than one round of evaluating results and readjusting.

For the case of Chinese, we want to pay attention to how the percentage of changes is computed, and we will be evaluating this in more detail in T251893. Please feel free to provide input or report any issues you may find.

Thanks!

Based on community feedback, this change has been reverted (T252371) until the algorithm to measure content modifications is reviewed in T251893. The process for adjusting the limits will continue once the algorithm is adjusted.