
Make the check for unmodified content across the whole document stricter on Indonesian Wikipedia
Closed, Resolved (Public)


Currently, an error that prevents publishing is shown when the amount of unmodified content is 99% or higher (T190283). This was intended to block the clearest cases of vandalism. Given that low-quality translations seem to proliferate on Indonesian Wikipedia (T219851), we want to make this threshold stricter.

On Indonesian Wikipedia (and only there), the threshold will be adjusted to prevent the publication of translations in which more than 30% of the overall content is unmodified.
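
The per-wiki rule described above can be sketched roughly as follows. This is only an illustration, not the actual ContentTranslation or mediawiki-config code: the names `DEFAULT_UNMODIFIED_LIMIT`, `PER_WIKI_UNMODIFIED_LIMIT`, `unmodified_limit`, and `blocks_publication` are hypothetical, the wiki identifiers are used only as lookup keys, and treating the limit as inclusive ("at or above") for every wiki is an assumption.

```python
# Hypothetical sketch of a per-wiki limit on unmodified machine translation.
# None of these names come from the real configuration.

# General rule from T190283: block publishing at 99% or more unmodified content.
DEFAULT_UNMODIFIED_LIMIT = 0.99

# Stricter limit proposed in this task for Indonesian Wikipedia only.
PER_WIKI_UNMODIFIED_LIMIT = {
    "idwiki": 0.30,
}


def unmodified_limit(wiki: str) -> float:
    """Return the limit for a wiki, falling back to the general default."""
    return PER_WIKI_UNMODIFIED_LIMIT.get(wiki, DEFAULT_UNMODIFIED_LIMIT)


def blocks_publication(wiki: str, unmodified_ratio: float) -> bool:
    """Return True if a translation should be prevented from publishing.

    unmodified_ratio is the share of the whole document left as unmodified
    machine translation, expressed as a value between 0.0 and 1.0.
    Assumption: the limit is inclusive for every wiki.
    """
    if not 0.0 <= unmodified_ratio <= 1.0:
        raise ValueError("unmodified_ratio must be between 0.0 and 1.0")
    return unmodified_ratio >= unmodified_limit(wiki)
```

With these assumed values, a translation that is 50% unmodified would be blocked on `idwiki` but still allowed on any wiki that uses the default limit.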

Based on data and feedback, we'll evaluate whether the adjusted threshold, in conjunction with the other proposed adjustments (T221359), improves the situation significantly or needs further adjustment. We need to keep the potential for false positives in mind: elements such as proper nouns, templates, short section titles, and references are often legitimately left unmodified and are fine for users to publish, so we may need to keep a 5-10% margin of error.

Event Timeline

The Indonesian community has decided to remove machine translation; this is the will of the community:

This comment was removed by Mimihitam.

Change 505220 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[operations/mediawiki-config@master] Use higher unmodified MT threshold for Indonesian Wikipedia

Change 505220 merged by KartikMistry:
[operations/mediawiki-config@master] Use higher unmodified MT threshold for Indonesian Wikipedia

Mentioned in SAL (#wikimedia-operations) [2019-04-23T11:18:48Z] <kartik@deploy1001> Synchronized wmf-config: SWAT: [[gerrit:505220]] Use higher unmodified MT threshold for Indonesian Wikipedia (T221353) (duration: 00m 57s)

This is working now. One related piece still pending is showing additional details that better explain the situation (T203377), which would be a good follow-up.

This does not work at all. I only changed one or two words, and it still got published. Compare:


As a note, a good translator would never translate "Brown's party of 22" as "Partai Brown 22", because in English that actually means "Brown 22 political party"! This clearly demonstrates how horrible machine translation is.

Since this option has failed, please disable machine translation entirely, as the community agreed in the first place.

Thank you.