We want to favour the creation of good translations and prevent “raw or lightly edited machine translation” from leaking into Wikipedia. The current systems that warn about, track and prevent the publication of translations may need some adjustments, both to avoid false positives and to be stricter in the cases where translations are more likely to be problematic.
To support this, this ticket proposes considering two factors:
- The number of problematic paragraphs. That is, the number of paragraphs for which the unmodified content exceeds the current thresholds. If there is a significant number of problematic paragraphs, we may want to prevent the translation from being published.
- Deletions of previous translations by the user. If any of the translations the user published in the main namespace during the last month were deleted, we can apply stricter limits to make sure that the content is properly reviewed.
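As a rough illustration of how the two factors could be computed, here is a minimal Python sketch. It is not the Content Translation implementation: the `PublishedTranslation` record, the function names, the per-paragraph "unmodified ratio" input and the reading of "the last month" as 30 days are all assumptions made for the example, and the threshold is passed in as a parameter because this ticket does not restate the current values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PublishedTranslation:
    # Hypothetical record of a translation the user published earlier.
    published_at: datetime
    namespace: int  # 0 is the main namespace in MediaWiki
    deleted: bool


def count_problematic(unmodified_ratios, threshold):
    """Count paragraphs whose unmodified share exceeds the given threshold.

    The ratios and the threshold are placeholders for whatever the
    existing per-paragraph check already computes.
    """
    return sum(1 for ratio in unmodified_ratios if ratio > threshold)


def has_recent_deletions(translations, now, window=timedelta(days=30)):
    """True if any main-namespace translation published within the window
    was deleted. The 30-day window is an assumed reading of "last month".
    """
    return any(
        t.deleted and t.namespace == 0 and now - t.published_at <= window
        for t in translations
    )
```

Both values would then feed the publishing decision described under "Proposed approach".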
Proposed approach
The proposed adjustment will be as follows:
For a regular user:
- With 0 - 9 problematic sections: Allow publishing and do not add to the tracking category (to reduce false positives as described in T217653).
- With 10 - 49 problematic sections: Allow publishing but add to the tracking category.
- With 50 or more problematic sections: Prevent publishing.
For a user with previous deleted translations:
- With 1 - 9 problematic sections: Allow publishing but add to the tracking category.
- With 10 or more problematic sections: Prevent publishing.
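The rules above can be sketched as a single decision function. This is only an illustration in Python, not the actual Content Translation code: the `Outcome` enum and the `decide` function are hypothetical names. The list does not say what happens for a user with previous deleted translations and 0 problematic sections, so the sketch assumes publishing is allowed without tracking in that case.

```python
from enum import Enum


class Outcome(Enum):
    PUBLISH = "allow publishing"
    PUBLISH_AND_TRACK = "allow publishing and add to the tracking category"
    BLOCK = "prevent publishing"


def decide(problematic: int, has_deleted_translations: bool) -> Outcome:
    """Map the number of problematic sections to a publishing outcome."""
    if has_deleted_translations:
        # Stricter limits for users with previously deleted translations.
        if problematic >= 10:
            return Outcome.BLOCK
        if problematic >= 1:
            return Outcome.PUBLISH_AND_TRACK
        # Assumption: the proposal does not list the 0 case for these users.
        return Outcome.PUBLISH
    # Regular users.
    if problematic >= 50:
        return Outcome.BLOCK
    if problematic >= 10:
        return Outcome.PUBLISH_AND_TRACK
    return Outcome.PUBLISH
```

For example, `decide(12, False)` yields `Outcome.PUBLISH_AND_TRACK`, while `decide(12, True)` yields `Outcome.BLOCK`. Keeping the thresholds in one function makes it easy to adjust them if the limits prove too strict or too lenient.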
Design details
For the cases where publishing is prevented, we want to show an error message. The error message is the same one used for the unmodified threshold for the whole document (T190283), but with the text "Your translation contains significant portions of unmodified text".
In addition, we want users to easily see where the content they need to fix is. When the error is shown, the warnings about too much unmodified content (T190279) will be made visible on the problematic paragraphs, even if they were "marked as resolved".