Translate extension: Add language as an abuse filter variable
Closed, Resolved · Public

Description

Currently, translate_source_text is a variable that provides the source text being translated. A new variable, translate_target_language, should be created to provide the language that is being translated into.

Use case:
Filters can detect whether a translation is unusually short or long by comparing its length to the length of the source text[1]. However, this works poorly when the target language is, for example, Chinese or Korean: the expected lengths are quite different, because these languages are fundamentally different in structure from the source language.
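
As a rough illustration of why the target language matters, the sketch below (in Python, not AbuseFilter rule syntax) applies a different acceptable length ratio per target language. The bounds and the names RATIO_BOUNDS, DEFAULT_BOUNDS, and is_suspicious_length are made up for this example; they are not taken from filter 76 or from the Translate extension.

```python
# Hypothetical bounds on len(translation) / len(source), keyed by target language.
# The numbers are illustrative assumptions, not measured values.
RATIO_BOUNDS = {
    "zh": (0.1, 1.5),
    "ko": (0.2, 2.0),
}
DEFAULT_BOUNDS = (0.5, 3.0)


def is_suspicious_length(source_text: str, translation: str, target_language: str) -> bool:
    """Return True if the translation is oddly short or long for its language."""
    if not source_text:
        return False  # nothing to compare against
    low, high = RATIO_BOUNDS.get(target_language, DEFAULT_BOUNDS)
    ratio = len(translation) / len(source_text)
    return ratio < low or ratio > high
```

With a single global threshold, a short but correct Chinese translation could be flagged; keying the bounds on the target language avoids that.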

Implementation:
In TranslateHooks::onAbuseFilterFilterAction, TranslateHooks::onAbuseFilterComputeVariable, and TranslateHooks::onAbuseFilterBuilder, register a new variable that is computed using $handle->getCode().
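
A rough Python model of what the new variable would contain, assuming (this is an assumption about MessageHandle, not something stated in this task) that the language code is the part of the page title after the last slash. target_language_from_title is a hypothetical helper, not part of the Translate extension.

```python
def target_language_from_title(title: str) -> str:
    """Return the text after the last '/' in a title, or '' if there is none.

    Assumption: translation units are titled like 'Translations:Page/1/de',
    with the target language code as the final path segment.
    """
    _, sep, code = title.rpartition("/")
    return code if sep else ""
```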

[1] Example: https://www.mediawiki.org/wiki/Special:AbuseFilter/76

Event Timeline

Restricted Application added a project: User-DannyS712. · Oct 25 2019, 1:14 AM
Restricted Application added subscribers: revi, Aklapper.
DannyS712 added subscribers: Daimona, Nikerabbit.

Pinging @Nikerabbit @Daimona for feedback

Only target language? Don't you also need source language for consistency? Length in terms of strlen or mb_strlen?

Only target language? Don't you also need source language for consistency? Length in terms of strlen or mb_strlen?

The MessageHandle class doesn't seem to have a function for the source language, and on WMF wikis English is almost always the source language. A future task could provide the source language.

Target only seems fine for now.

Length in terms of strlen or mb_strlen?

AF can already compute the length (in terms of mb_strlen), no need to pass it.
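
For context on the strlen versus mb_strlen question: PHP's strlen counts bytes, while mb_strlen counts characters, so the two diverge for non-ASCII text. A small Python analogue (not PHP, and not AbuseFilter code) shows the gap, assuming UTF-8; byte_length and char_length are names invented for this sketch.

```python
def byte_length(text: str) -> int:
    """Rough analogue of PHP strlen() on a UTF-8 string: counts bytes."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Rough analogue of PHP mb_strlen() with UTF-8: counts code points."""
    return len(text)


for sample in ("hello", "你好"):
    # "hello" has 5 bytes and 5 characters; "你好" has 6 bytes and 2 characters
    print(sample, byte_length(sample), char_length(sample))
```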

In TranslateHooks::onAbuseFilterFilterAction [...]

I'd suggest the onGenerateTitleVars hook if possible, as it would be more suited.

The MessageHandle class doesn't seem to have a function for the source language, and on WMF wikis English is almost always the source language. A future task could provide the source language.

$handle->isValid() && $handle->getGroup()->getSourceLanguage(), though perhaps better not to assume that groups will have only a single source language in the future.

[snip]
I'd suggest the onGenerateTitleVars hook if possible, as it would be more suited.

I was basing this on the current implementation of translate_source_text in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Translate/+/refs/heads/master/TranslateHooks.php, though using onGenerateTitleVars might be better (and might also help with T236193: translate_source_text is not set when examining recentchanges).

Change 548916 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/Translate@master] Add translate_target_language variable for abuse filters

https://gerrit.wikimedia.org/r/548916

Change 548916 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Add translate_target_language variable for abuse filters

https://gerrit.wikimedia.org/r/548916

DannyS712 closed this task as Resolved. · Edited · Mar 26 2020, 5:05 AM

I hadn't registered that this was done.
Should this have been announced in Tech News?