Page MenuHomePhabricator

CheckUser: Detected potential trojan source attack with unicode bidi introduced in this code
Closed, ResolvedPublicSecurity

Description

While running npm run test on CheckUser I saw:

/Users/kostajh/src/mediawiki/w/extensions/CheckUser/i18n/api/fa.json
  13:79  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"محدودهٔ زمانی دادهٔ کاربر (مثلاً «-2 weeks» یا «2 weeks ago»)."'          security/detect-bidi-characters
  13:94  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"محدودهٔ زمانی دادهٔ کاربر (مثلاً «-2 weeks» یا «2 weeks ago»)."'          security/detect-bidi-characters
  25:83  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"نمایش بازرسی‌های 192.0.2.0/24 پس از 2011-10-15T23:00:00Z"'                 security/detect-bidi-characters
  27:77  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"باید بازهٔ زمانی درست استفاده کنید (نظیر «2 weeks ago» یا «-2 weeks»)."'  security/detect-bidi-characters
  27:95  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"باید بازهٔ زمانی درست استفاده کنید (نظیر «2 weeks ago» یا «-2 weeks»)."'  security/detect-bidi-characters

/Users/kostajh/src/mediawiki/w/extensions/CheckUser/i18n/fa.json
  22:291  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"این ابزار تغییرات اخیر را برای به دست آوردن نشانی‌های آی‌پی استفاده شده توسط یک کاربر و یا تعیین ویرایش‌ها و اطلاعات کاربری مرتبط با یک نشانی آی‌پی جستجو می‌کند.\nکاربرها و ویرایش‌های مرتبط با یک نشانی آی‌پی را می‌توان با توجه به اطلاعات سرآیند اکس‌اف‌اف (با افزودن «/xff» به انتهای نشانی آی‌پی) پیدا کرد.\nهر دو پروتکل IPv4 (معادل CIDR $1-32) و IPv6 (معادل CIDR $2-128) توسط این ابزار پشتیبانی می‌شوند.\nبنا به دلایل عملکردی، بیش از ۵۰۰۰ ویرایش بازگردانده نمی‌شود.\nاز این ابزار طبق سیاست‌ها استفاده کنید."'  security/detect-bidi-characters
  22:292  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"این ابزار تغییرات اخیر را برای به دست آوردن نشانی‌های آی‌پی استفاده شده توسط یک کاربر و یا تعیین ویرایش‌ها و اطلاعات کاربری مرتبط با یک نشانی آی‌پی جستجو می‌کند.\nکاربرها و ویرایش‌های مرتبط با یک نشانی آی‌پی را می‌توان با توجه به اطلاعات سرآیند اکس‌اف‌اف (با افزودن «/xff» به انتهای نشانی آی‌پی) پیدا کرد.\nهر دو پروتکل IPv4 (معادل CIDR $1-32) و IPv6 (معادل CIDR $2-128) توسط این ابزار پشتیبانی می‌شوند.\nبنا به دلایل عملکردی، بیش از ۵۰۰۰ ویرایش بازگردانده نمی‌شود.\nاز این ابزار طبق سیاست‌ها استفاده کنید."'  security/detect-bidi-characters

/Users/kostajh/src/mediawiki/w/extensions/CheckUser/i18n/he.json
   38:30  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"API: $1"'                         security/detect-bidi-characters
  102:54  warning  Detected potential trojan source attack with unicode bidi introduced in this code: '"בערך $1 מכל המשתמשים ב־/64 הזה"'  security/detect-bidi-characters

I don't know if this is a legitimate issue; filing this task so we can follow up on it.

Details

Risk Rating
Low
Author Affiliation
WMF Product

Event Timeline

@Mooeypoo @Amire80 Have you encountered anything like this elsewhere?

The usage of the characters in these strings is legitimate. Feel free to call me biased because I translated the Hebrew ones (probably all of them), and while I don't know Persian, I'm quite sure that they are legit there, too.

In Hebrew, it's probably possible to avoid using them by writing some parts of messages in HTML tags like <bdi> or <span> (with a dir attribute). That, however, may introduce some other complications, because some messages don't allow HTML, and it's very hard which ones those are from our translation interface.

I'm a localization person and I'm really not a security person. If any security people think that this may be a security threat, it's possible to use HTML instead (certainly in Hebrew, probably also in Persian), but as I said, it may cause some other complications.

If it's not really a threat, perhaps the i18n/*.json files can be excluded from these checks. (Or at least the localization files for RTL language.)

100% agree with @Amire80 that these are non-issues, from a security standpoint.

What I believe this stems from is that we added the eslint-plugin-security to the wikimedia eslint config a while back, see: https://github.com/wikimedia/eslint-config-wikimedia/pull/502 and T328568. Some of the rules, notably the bidi detection, were a bit overzealous within the context of Wikimedia code, particularly the standard json i18n/l10n files. Those checks were disabled via https://github.com/wikimedia/eslint-config-wikimedia/pull/502, so this may just be an issue of needing to pull the latest tag: https://github.com/wikimedia/eslint-config-wikimedia/releases/tag/v0.25.1.

Got it, thanks for checking. Indeed, updating to eslint-config-wikimedia@0.25.1 results in the warnings disappearing. I've made a patch for that in T338773: LibraryUpgrader is not updating CheckUser. I think we can remove the Security tag from this.

Sounds good, thanks! To be fair, MediaWiki's i18n/l10n files would still be a potential trojan source injection point. It's just that the signal-to-noise ratio for most automated tooling would be too low to be truly effective for many of these files. So we just need to exercise some extra diligence during CR, etc. when reviewing i18n/l10n files.

sbassett triaged this task as Low priority.
sbassett changed Author Affiliation from N/A to WMF Product.
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett changed Risk Rating from N/A to Informational.
sbassett changed Risk Rating from Informational to Low.
sbassett edited projects, added SecTeam-Processed; removed Security-Team.

Although it is debatable how practical the whole BIDI trojan thing is in general, it should be noted that the alert above seems to be alerting on LRM/RLM. It seems like it would be very impossible to pull off this type of attack with those characters. Perhaps the script should be limited to only look for RLO U+202E and LRO U+202D (If you want to be really paranoid include RLI, LRI, RLE and LRE), which seem like the only applicable BIDI control characters [And conveniently, are also not normally used in translations unless you are doing something very weird]

Although it is debatable how practical the whole BIDI trojan thing is in general, it should be noted that the alert above seems to be alerting on LRM/RLM. It seems like it would be very impossible to pull off this type of attack with those characters. Perhaps the script should be limited to only look for RLO U+202E and LRO U+202D, which seem like the only applicable BIDI control characters [And conveniently, are also not normally used in translations unless you are doing something very weird]

This would probably involve writing our own eslint-plugin-security rule, as I doubt that is configurable out of the box. But it's obviously doable. And maybe not too difficult, given what the existing rule is already doing.