Page MenuHomePhabricator

Sanitizer::stripAllTags() causing double-escape false positive from phan-taint-check
Open, NormalPublic


Noticed when backporting the patch for T229541 within gerrit. There was definitely a legitimate XSS within MobileFrontend , though phan-taint-check came back with a SecurityCheck-DoubleEscaped error:

<file name="includes/specials/MobileSpecialPageFeed.php">
    <error line="45" severity="warning" message="Calling method \htmlspecialchars() in \MobileSpecialPageFeed::formatComment that outputs using tainted argument $[arg #1]. (Caused by: includes/specials/MobileSpecialPageFeed.php +43)" source="SecurityCheck-DoubleEscaped"/>

As noted within the doc, Sanitizer::stripAllTags() output needs to be further escaped for literal html inclusion. Maybe some additional code within MediaWikiSecurityCheckPlugin.php is needed?

Event Timeline

sbassett created this task.Aug 9 2019, 7:19 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 9 2019, 7:19 PM
sbassett renamed this task from Sanitizer::stripAllTags() causing double-escape false positive in phan-taint-check to Sanitizer::stripAllTags() causing double-escape false positive from phan-taint-check.Aug 9 2019, 7:19 PM
sbassett triaged this task as Normal priority.

I cannot really check at the moment, but if taint-check can't understand the taintedness of the return value, then we should add custom annotations to the docblock of stripAllTags, instead of hardcoding stuff in taintcheck itself.

Ok, that makes sense.

First of all, MobileFrontend is reusing $comment with different levels of taintedness. The plugin is well known to be unable to handle this correctly, although the fix for that is not easy. But for this specific case, renaming the variables doesn't help.

And then, Linker::formatComment is doing the same thing, but again it doesn't matter.

As a side note, if the var reuse in Linker::formatComment had played a role, this would have been a core problem. And I believe there could be various cases like that (i.e. core causing false positives in extensions), given that taint-check is not running for core (T216348).

Anyway, the real cause here is a weird (?) bug: Linker::formatComment is returning an escaped comment. But Sanitizer::stripAllTags removes the escaping and returns tainted HTML. I didn't even check, but I bet that Remex's Tokenizer is too complicated for the plugin, and so it just thinks that the taintedness of the return value is the same as the parameter's (=escaped).

I think the best we can do is add @return-taint tainted to the docblock of Sanitizer::stripAllTags. Note that it could cause new problems to be detected, and thus CI could start failing for extensions. Unfortunately, we cannot use anything more specific: the output of stripAllTags will be tainted only (I think) if the unescaped version of the input was. But there's no way to detect it from taint-check. And actually, that's pretty bizarre... Maybe @Bawolff has better ideas?

Change 530009 had a related patch set uploaded (by SBassett; owner: SBassett):
[mediawiki/core@master] Set @return-taint of Sanitizer::stripAllTags to tainted

sbassett added a comment.EditedAug 13 2019, 10:09 PM

@Daimona - I'd be interested to hear @Bawolff's thoughts, but for now I've just submitted a patch for Sanitizer.php as you suggested. I think that's the easiest approach, unless it gets too noisy in CI, in which case we can revert. But I'd personally rather find more things than not :)

Change 530009 merged by jenkins-bot:
[mediawiki/core@master] Set @return-taint of Sanitizer::stripAllTags to tainted

I've submitted a test change to a bunch of extensions using stripAllTags to see if there were CI failures. None of them failed, so hooray. Leaving open in case some extension I didn't test is broken and/or we find a better solution.

@Daimona - Awesome, glad to hear that. Thanks!

Change 530009 merged by jenkins-bot:
[mediawiki/core@master] Set @return-taint of Sanitizer::stripAllTags to tainted

Belated, this sounds like a good idea to me. Another possibility might be to just do @return-taint html, instead of flagging all taint types.