Page MenuHomePhabricator

Audit GrowthExperiments for XSS vulnerabilities
Closed, ResolvedPublicSecurity

Description

Few moments ago, I filled tasks for a couple of XSS vulnerabilities found in GrowthExperiments. SInce there's at least three of them (and I could probably find more, as it didn't take much effort after I spotted the initial one, T289063), I'm also filling an auditing task to systematically look for similar pattern in the rest of GrowthExperiments codebase, to make sure nothing was missed.

The general pattern is that DashboardModule abstract class, that's used for creating modularized dashboard special pages calls methods like getFooter, which is then displayed in a standard place of the module. For getFooter(), the output ends up in the footer.

The caveat is that those method are expected to output raw HTML, rather than strings. It's possible to output raw messages, but in that case, ->parse() or ->escaped() must be used. This caused several XSS vulns in the codebase.

IMO, that pattern is not ideal -- it's very easy to make a mistake. I think that instead, something like HtmlArmor should be used to mark HTML that's intentionally raw HTML, treating everything else as text that needs to be escaped.

Event Timeline

Urbanecm_WMF closed subtask Restricted Task as Invalid.Aug 17 2021, 3:09 PM

One thing to audit here is why the phan-taint plugin did not flag these issues (which seem fairly straightforward).

One thing to audit here is why the phan-taint plugin did not flag these issues (which seem fairly straightforward).

I'm not sure this type of check is fairly straight-forward, at least not for phan-taint-check. There may be a depth issue going on in this case as there are at least a few layers of class/interface/function obfuscation going on before the messages from the various header and subheader content methods are actually rendered. In contrast, phan-taint-check finds these when they are immediately echo'd, as in this test patch set: https://gerrit.wikimedia.org/r/713693. @Daimona might have more thoughts on this, as they are far more intimate with the phan-taint-check code at this point. But again, this may or may not be an easy fix for phan-taint-check.

One thing to audit here is why the phan-taint plugin did not flag these issues (which seem fairly straightforward).

I'm not sure this type of check is fairly straight-forward, at least not for phan-taint-check. There may be a depth issue going on in this case as there are at least a few layers of class/interface/function obfuscation going on before the messages from the various header and subheader content methods are actually rendered. In contrast, phan-taint-check finds these when they are immediately echo'd, as in this test patch set: https://gerrit.wikimedia.org/r/713693. @Daimona might have more thoughts on this, as they are far more intimate with the phan-taint-check code at this point. But again, this may or may not be an easy fix for phan-taint-check.

I'm going to reply taking T289063 as an example, particularly the one in Resources caused by the growthexperiments-mentor-dashboard-resources-intro message, but I think the same applies to all other issues. Essentially, taint-check doesn't use information from methods declared in subclasses; that is, the following is a minimum test that reproduces this issue:

class Main {
	function doFoo() {
		echo $this->getBaz(); // Should report an XSS here, but it doesn't
	}
	function getBaz() {
		return 'x';
	}
}

class Child extends Main {
	function getBaz() {
		return $_GET['x'];
	}
}

This seems to be a limitation of phan as well, see demo. I'm not even sure if phan offers an API for retrieving a list of subclasses of a given class. If it does, then resolving this would be easy, but then I'd be concerned about performance.

I've created T289314 anyway.

Per the discussion in T289314, we should probably add @return-taint html to the appropriate DashboardModule methods.

Per the discussion in T289314, we should probably add @return-taint html to the appropriate DashboardModule methods.

+1

Would it make sense to note the subclass issue and task here: https://www.mediawiki.org/wiki/Continuous_integration/Phan#Known_Problems ?

I'd say even the taint-check README would make sense; I'll be happy to review if you want to send a PR. Also, I'm not 100% sure, but I looked again and I think phan doesn't provide any way to list the subclasses of a given class, nor does it seem to store this information internally, so it should be noted that this is an upstream limitation. Checking up with upstream could also make sense.

Noting here that while T289314 does solve the bigger issue, the specific example in GrowthExperiments still goes undetected. That's essentially blocked on r681443, since you're calling buildElement() which calls rawElement() which preserves the taintedness of the content.

Urbanecm changed the visibility from "Custom Policy" to "Public (No Login Required)".Sep 9 2021, 7:46 PM
Urbanecm changed the edit policy from "Custom Policy" to "All Users".

Task published, as there's no private info apart from what is in the (now public) subtasks.

kostajh claimed this task.

I'm tentatively marking this as resolved, although if there are more specific tasks we could reopen.

The specific task would be to search for all lines where we transform a message without escaping (so ->text() and ->plain() in PHP, .plain(), .text() and .msg(...) in Javascrit) and make sure they are handled correctly.