Page MenuHomePhabricator

Consider the taintedness of methods declared in subclasses when analyzing method calls
Open, Needs TriagePublicBUG REPORT

Description

Test case:

class Main {
	function doFoo() {
		echo $this->getBaz(); // Should report an XSS here, but it doesn't
	}
	function getBaz() {
		return 'x';
	}
}

class Child extends Main {
	function getBaz() {
		return $_GET['x'];
	}
}

This seems to be a limitation of phan as well, see demo. I'm not even sure if phan offers an API for retrieving a list of subclasses of a given class. If it does, then resolving this would be easy, but then I'd be concerned about performance.

Upstream issue: https://github.com/phan/phan/issues/4502

Event Timeline

Would it be possible to taint from both ends? Ie. when you hit echo $this->getBaz(), the method is marked as something that outputs raw HTML and must have an untainted return value, and then that marking is inherited by subclasses?

Also, is it possible as a temporary (or non-temporary) workaround to manually mark the parent method as requiring an untainted return? From the documentation it seems like @return-taint exec_html would do the trick. (This is of course a lot more error-prone than fully automated taint detection but better than nothing.)

Would it be possible to taint from both ends? Ie. when you hit echo $this->getBaz(), the method is marked as something that outputs raw HTML and must have an untainted return value

We already do this, so we can flag things like

class Main {
	public $baz;

	function getBaz() {
		return $this->baz;
	}
	function setBaz($v) {
		$this->baz = $v;
	}

	function doFoo() {
		echo $this->getBaz();
	}

}

$main = new Main;
$main->setBaz( $_GET['evil'] );

and then that marking is inherited by subclasses?

However, it doesn't propagate to subclasses currently.

Also, is it possible as a temporary (or non-temporary) workaround to manually mark the parent method as requiring an untainted return? From the documentation it seems like @return-taint exec_html would do the trick. (This is of course a lot more error-prone than fully automated taint detection but better than nothing.)

Yep! The correct syntax is @return-taint html, EXEC flags aren't supported in return-taint because the semantic would be unclear.

Change 714178 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/tools/phan/SecurityCheckPlugin@master] [WIP] Subclasses

https://gerrit.wikimedia.org/r/714178

Quick update. The good news is that the PR to expose method overrides was merged upstream, which means we can use it as long as phan 5.1.1 is released and we start using it. The bad news is that, possibly because I haven't touched it for a while, implementing this in taint-check is not as easy as I thought. In fact, it's complicated by the following things:

  • The relevant code in taint-check is a bit convoluted, having several levels of get, set, get-and-set, merge, all accounting for every different way that a function can become tainted (as analysis result, hardcoded in the plugin, docblock annotations, ...). I'm still not 100% sure where the merge-with-overrides thingy should happen.
  • Merging information from subclasses will certainly make the plugin slower and more memory-hungry. I believe this is totally acceptable, but the impact should nonetheless be limited as much as possible.
  • Some approaches are vulnerable to the following false positive:
class Class1 {
   static function getVal() {
      return 'safe';
   }
}
class Class2 extends Class1 {
   static function getVal() {
      return $_GET['unsafe'];
   }
}

echo Class1::getVal(); // This shouldn't be reported

and avoiding that might be a bit difficult depending on the approach.

Change 714630 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/tools/phan/SecurityCheckPlugin@master] Add known issue of Phan not having a subclass API to README

https://gerrit.wikimedia.org/r/714630

Change 714630 merged by jenkins-bot:

[mediawiki/tools/phan/SecurityCheckPlugin@master] Add known issue of Phan not having a subclass API to README

https://gerrit.wikimedia.org/r/714630

I don't actually think @return-taint html is a solution to this problem. That makes the function to be considered to return html regardless of what its input parameters, where what is needed is to consider the function to be always called even if no call is detected anywhere.

I don't actually think @return-taint html is a solution to this problem. That makes the function to be considered to return html regardless of what its input parameters, where what is needed is to consider the function to be always called even if no call is detected anywhere.

I don't remember the details of the conversation above, but yeah... I believe what I wrote in T289314#7301925 still holds true. I'm particularly concerned about the increase in time and memory, both of which are already on the high side of things. I can't think of low hanging fruits that could yield decent performance improvements without also reducing the quality. I've been staring at profiles yesterday but did not identify anything obvious (I may just need to look better, though!). And to make things worse, there are two changes (r906771 and especially r681443) that I would really like to finish and put in review before working on this task, but the impact of those changes on the performance of the plugin is pretty much unacceptable already :-/