Reduce the number of messages that trigger false positives in the security check to speed up manual review
Closed, Resolved | Public | 2 Estimated Story Points

Description

There is a list of (some) problematic messages that slow down localisation updates because they need manual review each time a translation is added or updated: https://etherpad.wikimedia.org/p/i18n-check

The goal of this task is to reduce the number of such messages by ~10.

There are multiple options for avoiding false positives, and the best action is determined on a case-by-case basis:

  • Remove HTML from the message. E.g. <span class="foo">bar</span> should be just bar, and the wrapping HTML should be produced by the code.
  • Make the HTML-lookalike content a variable, e.g. There is unbalanced <translate> tag can be changed to There is unbalanced $1 tag, with <translate> passed in as a message parameter.
  • Make the HTML-lookalike characters use a different symbol, e.g. < Go back can be 〈 Go back.
  • Add the HTML-lookalike tag to the checker allow list. E.g. <pagelist> should always be safe. See T222216: Improve i18n CI checker for an example.
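
To illustrate how each option removes a false positive, here is a minimal sketch of a tag-lookalike check with an allow list. It is not the actual i18n CI checker: the function name find_suspicious, the regular expression, and the contents of ALLOWED_TAGS are assumptions made for illustration only, and the real checker may flag different patterns.

```python
import re

# Hypothetical allow list for illustration; the real checker keeps its own
# list in its CI configuration.
ALLOWED_TAGS = {"pagelist"}

# A "<" optionally followed by "/" and something that looks like a tag name.
# A bare "<" (e.g. used as an arrow) matches with no tag name captured.
TAG_LIKE = re.compile(r"<(?:/?([A-Za-z][A-Za-z0-9]*))?")


def find_suspicious(message, allowed=ALLOWED_TAGS):
    """Return the HTML-lookalike fragments that would need manual review."""
    hits = []
    for match in TAG_LIKE.finditer(message):
        name = (match.group(1) or "").lower()
        if name not in allowed:
            hits.append(match.group(0))
    return hits


# Option 1: move the wrapping HTML out of the message.
print(find_suspicious('<span class="foo">bar</span>'))  # flagged
print(find_suspicious("bar"))                            # clean

# Option 2: pass the tag in as a message parameter.
print(find_suspicious("There is unbalanced <translate> tag"))  # flagged
print(find_suspicious("There is unbalanced $1 tag"))           # clean

# Option 3: use a different symbol instead of "<".
print(find_suspicious("< Go back"))   # flagged
print(find_suspicious("〈 Go back"))  # clean

# Option 4: extend the allow list.
print(find_suspicious("Use <pagelist> here"))  # clean, tag is allowed
```

In this sketch, options 1 to 3 change the message so that nothing matches, while option 4 changes the checker so that a known-safe tag is skipped.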

Message Items

  • Social profile
  • Wikimedia Messages
  • MediaWiki core
  • Collection
  • DonationInterface
  • Math
  • PrivateDomains
  • ProofreadPage
  • TemplateData
  • Wikimedia portals
  • HeadScript
  • Timeline
  • Flow
  • Babel
  • Translate
  • CirrusSearch
  • map-of-monuments
  • PGFTikZ
  • WikiEditor
  • mediawiki/extensions/Collection
  • CiteDrawer
  • LinkAttributes
  • UserStatus

Event Timeline

Nikerabbit set the point value for this task to 2. Mar 31 2022, 10:25 AM

Change 779015 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/Translate@master] <translate> tag: replace with message parameter

https://gerrit.wikimedia.org/r/779015

Change 786275 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/Translate@master] Update message description

https://gerrit.wikimedia.org/r/786275

Change 779015 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] <translate> tag: replace with message parameter

https://gerrit.wikimedia.org/r/779015

Change 790971 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/Translate@master] Replace tags with use of parameter: $1

https://gerrit.wikimedia.org/r/790971

Change 786275 abandoned by Wangombe:

[mediawiki/extensions/Translate@master] Update message description on qqq.json

Reason:

No longer necessary

https://gerrit.wikimedia.org/r/786275

Change 790979 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/timeline@master] <timeline> tag: replace with message parameter

https://gerrit.wikimedia.org/r/790979

Change 790981 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/SocialProfile@master] Replace < with html character '&lt;'

https://gerrit.wikimedia.org/r/790981

Change 790971 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] Reword messages to reduce translation ambiguity

https://gerrit.wikimedia.org/r/790971

Change 792169 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/UserStatus@master] Replace < character with &lt;

https://gerrit.wikimedia.org/r/792169

Change 792172 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/Link_Attributes@master] Replace <> with html character codes.

https://gerrit.wikimedia.org/r/792172

Change 792203 had a related patch set uploaded (by Wangombe; author: Wangombe):

[integration/config@master] Allow some more tags and attributes

https://gerrit.wikimedia.org/r/792203

Change 792203 merged by jenkins-bot:

[integration/config@master] jjb: [mediawiki-i18n-check-docker] Allow bdi, dl, dd, and ref tags

https://gerrit.wikimedia.org/r/792203

Change 792382 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/WikimediaMessages@master] Replace <> with html character codes.

https://gerrit.wikimedia.org/r/792382

Change 792449 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/DonationInterface@master] Replace use of <> with HTML character codes

https://gerrit.wikimedia.org/r/792449

Wangombe updated the task description.

Change 792382 abandoned by Wangombe:

[mediawiki/extensions/WikimediaMessages@master] Replace <> with html character codes.

Reason:

Out of scope. Change requires more input that may introduce breaking changes.

https://gerrit.wikimedia.org/r/792382

Change 792449 abandoned by Wangombe:

[mediawiki/extensions/DonationInterface@master] Replace use of <> with HTML character codes

Reason:

changes made would sort of solve one problem and create another.

https://gerrit.wikimedia.org/r/792449

As of this comment, the number of messages has been reduced by more than 10. This patch handles a number of these messages by adding some tags to the safe list.

Change 792172 merged by jenkins-bot:

[mediawiki/extensions/Link_Attributes@master] Replace <> with html character codes.

https://gerrit.wikimedia.org/r/792172

Change 792169 merged by jenkins-bot:

[mediawiki/extensions/UserStatus@master] Replace < character with &lt;

https://gerrit.wikimedia.org/r/792169

Change 799301 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/HeadScript@master] Replace <> with html character codes

https://gerrit.wikimedia.org/r/799301

abi_ added a subscriber: abi_.

I'm generally seeing fewer patches to review when automated exports run.

Leaving this open for some more time to monitor for any issues caused by this patch.

Change 799301 merged by jenkins-bot:

[mediawiki/extensions/HeadScript@master] Replace <> with html character codes

https://gerrit.wikimedia.org/r/799301

Change 790979 merged by jenkins-bot:

[mediawiki/extensions/timeline@master] <timeline> tag: replace with message parameter

https://gerrit.wikimedia.org/r/790979

Wangombe updated the task description.

As of this comment, I have managed to reduce the number of messages by ~13. The actual number may be higher because one of the patches added a few items to the safe tags list.

Change 790981 merged by jenkins-bot:

[mediawiki/extensions/SocialProfile@master] Replace < with html character '&lt;'

https://gerrit.wikimedia.org/r/790981

Change 799955 had a related patch set uploaded (by Wangombe; author: Wangombe):

[mediawiki/extensions/UserStatus@master] Revert changes to use < instead of '&lt;'

https://gerrit.wikimedia.org/r/799955

Change 799955 merged by jenkins-bot:

[mediawiki/extensions/UserStatus@master] Revert change to use < instead of '&lt;'

https://gerrit.wikimedia.org/r/799955

For UserStatus, did you consider this option?

Make the HTML-lookalike characters use a different symbol, e.g. < Go back could be 〈 Go back.

It wasn't considered but it seems this will produce a consistent result.

Change 803863 had a related patch set uploaded (by Nikerabbit; author: Wangombe):

[mediawiki/extensions/UserStatus@master] Replace < and &lt; characters with 〈

https://gerrit.wikimedia.org/r/803863

Change 803863 merged by jenkins-bot:

[mediawiki/extensions/UserStatus@master] Replace < and &lt; characters with 〈

https://gerrit.wikimedia.org/r/803863