the plugin is built on top of phan 0.8.0 which is pretty old now. It would be good to update it on top of a modern version of phan, probably 1.3.1.
- Mentioned In
- T207344: Phan-taint-check-plugin not available for PHP > 7.0
T218719: Upgrade php-ast to 1.0.1 in CI composer-package containers
T203651: Optimize phan-taint-check speed
- Mentioned Here
- T218719: Upgrade php-ast to 1.0.1 in CI composer-package containers
T207344: Phan-taint-check-plugin not available for PHP > 7.0
This would indeed be awesome. However, see my comment in T207344#5033198. Basically, phan has many breaking changes even in patch versions, so upgrading to 0.8.13 would already break a lot of stuff. Today I tried to get there, but eventually gave up due to this.
For whoever will want to proceed with this, here are the main obstacles I found while upgrading to 0.8.13. Note that I don't really know how phan works so it could be fault of mine (but I'm still happy to help).
- Clazz::getPropertyByNameInContext now has an extra required parameter, $is_static. However, I couldn't find a way to determine if the property is static where we call it.
- This commit causes 2 troubles:
- It adds the new class UnaddressableTypedElement, which is extended by Variable. This breaks all typehints for TypedElementInterface in TaintednessBaseVisitor. Just removing them (and adding UnaddressableTypedElement to the docblock) seems to work fine, but it should be checked more deeply.
- Parameter::getContext() is replaced by Parameter::getFileRef(), which however doesn't have the getScope() method used in setTaintedness.
There are some other changes too, but all of them are easy to do.
After 0.8.13 one may upgrade to 0.9.6. I didn't really check it, but it sounds like there aren't many differences between those, so it's likely that the upgrade will be possible without changing anything.
At this point the plugin should be upgraded to PluginV2, as support for old-style plugins was dropped in 0.9.7 (once again, that's what I meant with "phan doesn't comply with semver"). I think this will require a decent amount of work, but it should make the plugin execute faster thanks to the new design.
After 0.9.7, I don't know, but I guess that the road to 1.2.6 will be pretty long.
Please note that all of the above assumes the goal to just make seccheck work with new phan versions. Using new phan methods to speed up the analysis/make it more precise/avoid hacks/you name it is a whole different story.
At any rate, IMVHO we should stop adding any new feature to seccheck before upgrading to 1.2.x, as it's likely that every addiction will add extra code to be reworked.
So I'm looking at the upgrade to PluginV2. The main change is that *Visitor classes are now instantiated by phan itself, which only wants a "get*ClassName" method instead of a constructor. This mostly means that we cannot pass in an instance of the plugin anymore. Moreover, Visitor and PreVisitor must inherit from different classes, which prevents them from having TaintednessBaseVisitor as common parent (should be fixable by turning TaintednessBaseVisitor into a trait). Plus, the MediaWiki checker runs both the generic visitor and the MW visitor, which again isn't possible in V2 because you can only provide a single class name. I'll probably do some gradual changes before the v2 upgrade.
Great! Thanks @sbassett and @Bawolff! I guess we're currently blocked on T218719 which makes https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/497583/ fail, plus any further fix which I may have missed and will show up from CI.
Is it helpful to see a bunch of failed instances of composer-package-php70-docker? I'm not sure how valuable that information would be. And the thought here would be to eventually use the composer-package-php7(2|3)-docker whenever we're ready.
It depends on the reason of the failure :-) Right now it fails because it has ast 0.1.2 installed, which is a very old version anyway, so it'd be better to just upgrade it. But yes, we'll have to use php7. later on due to different phan requirements.
Yay! but please please don't create branches that have the same exact format as tags - composer is going to get confused and now we can't actually have a 2.0.0 release. Typically we use "2.x" or something.
The patch above should be the last one for this task, and it upgrades phan to 1.3.1 and ast to 1.0.1. I guess at this point we should re-enable CI for the 2.x branch, having PHP 7.2 and ast 1.0.1 there. Then, check if everything works as expected and fix whatever doesn't. Of course I can do that, aside from the CI changes.
Now I'm working on fixing the fresh code (i.e. make tests pass). It's not quick, but I have already identified a major problem: the bit of code for passByReference params (in TaintednessBaseVisitor::handleMethodCall) stopped working between 0.8.0 and 0.8.6. That piece of code already had a couple of fixme and todos saying that it had to be rewritten, so I guess the moment has come... The main problem here is that $func->getInternalScope()->getVariableByName( $param->getName() ) now returns an instance of Parameter, which doesn't have the taintedness of the var, while it used to be a PassByReferenceVariable holding much more info (including the taintedness).
At the point of https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506124/, there are 14 integration tests failing:
arraynumkey, callbackhook, getqueryinfo, json, maintenance, makelist, multiassign, refecho, refescape, refviaparam, refwrongesc, skinjson, taghook, unserialize
The ref* ones are almost surely due to the passByRef problem. The 10 remaining ones, I don't know. Anyway, I'd like to delay any further investigation until that bit of code is rewritten. @Bawolff could you please take a look?
I took a look anyway, and actually several other failures are due to reference params (mostly in hooks). The one in maintenance is fixed by https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506375/. The multiassign one happens because of https://github.com/phan/phan/commit/516170dda4d5e9ed6b0ad10c857cc48c1c393588, specifically because global variables are injected in the local scope (instead of local variables); I'm still investigating this one.
https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506491/ should partly fix the reference trouble, and depends on phan 1.3.2. However, the refescape test is still failing for the same root cause.
A little update: I managed to fix several regressions. Right now, only 5 tests are failing: callbackhook, json, multiassign, registerhook and skinjson. For some of these, the cause is https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506491/ which needs to be partly rewritten. Probably, this affects json, registerhook and skinjson (which use passbyref). The reason which makes multiassign fail is already at T216974#5137415 (I didn't investigate any more). No info about callbackhook.
EDIT: callbackhook was easy, https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506870/. So 4 failing tests remaining, probably all due to passbyref.
An almost-final update: every integration test is passing now. I also took the chance for a change which isn't totally related to the upgrade, https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506869/. See the commit message about why it's included there. So, as of https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506124/, CI can be re-enabled (after T218719 is resolved).
Now I started running this new version on AbuseFilter (which passes with seccheck 1.5.1) to see if we missed any regression. In fact, I found one with branch scopes (fixed in https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/507015/) and I now see just 4 DoubleEscaped warnings all due to OOUI\Widget. I'll try to see if those are other regressions or actual issues, then test the plugin again on other extensions to ensure everything is working.
After that, I think the 2.x branch will be ready for release.
Likely-final update: I fixed another couple of regressions, and tested the plugin on a bunch of MW extensions, results below. NOTE: I used random versions of the extensions (i.e. for some of them it could be an old version).
- AbuseFilter --> 4 DoubleEscaped OOUI warnings
- CheckUser --> 0 warnings
- CentralNotice --> 0 warnings
- CodeEditor --> 0 warnings
- ContentTranslation --> 1 XSS warning (unverified)
- Echo --> Several warnings. Some are OOUI widgets, others are unverified.
- Flow --> 0 warnings
- MassMessage --> 2 warnings, unverified
- MobileFrontend --> 4 DoubleEscaped OOUI warnings
- SpamBlacklist --> 1 warning, unverified
- Thanks --> 0 warnings
- TitleBlacklist --> 0 warnings
- VisualEditor --> 3 DoubleEscaped OOUI warnings
Overall, I think it would be an acceptable amount of warnings, given that some of them may be actual issues. IF it wasn't for the OOUI false positives. I didn't check deeply why those happen, but I discovered that they disappear by removing the "param-taint" annotation in OOUI. I guess a solution could be to avoid emitting the issue if it's a DoubleEscaped issue from OOUI, and then remove annotations from OOUI itself once all extensions will use seccheck 2.0. I'm going to send a patch for that.
That being said, the whole set of patches in 2.x is ready for review, and I think 2.0 can be released as soon as they're all merged. Fixing other seccheck bugs and FIXME/TODO comments can be done later.
I checked the OOUI thingy and yes, it's a false positive due to annotations. Actually, it's the same issue mentioned by @Bawolff in the commit message of . Now it also affects other widgets. Given that no XSS is reported without those annotations (which is nice!), I guess that our best bet is indeed to merge the exclusion hack  and remove the annotations when all extensions will use seccheck 2.x.
With this, I confirm that 2.x is ready for release. It's not perfect, of course, but there's time to improve it. I'll send some other patches for various FIXMEs etc. from now on. We can decide to include them in 2.x, but that's completely optional. Should you need any clarification about my patches, want to coordinate for mass-merge, or just need another pair of eyes, I'm available; either here, via email, on IRC, hangout, whatever you prefer.
Uhm @sbassett @Bawolff I think I just realized I explained poorly. With "https://gerrit.wikimedia.org/r/507619 and its dependencies" I meant that change and the ones it depends-on... Now I see you're reviewing it the other way round, so I want to make sure you know that. Basically, the needed to be merged before release are from https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/506491/ to https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/507617/ now. I'm sorry for the misunderstanding.
@Daimona - @Bawolff and I started going the right direction this time :) Most of the patch sets looked fine, though I think @Bawolff and I would like to test them locally a bit more (especially to confirm performance under specific versions of php, php-ast and phan). @Bawolff had some concerns about https://gerrit.wikimedia.org/r/508312/2 and then we weren't sure of the state of https://gerrit.wikimedia.org/r/507015/5 and whether it needed to be revisited or abandoned at this point. I think these might be the last remaining issues before we can tag an official 2.x release.
Cool, thanks :-) The former commit (508312) needs some more discussion, as it's apparently harder than I originally thought. I'll also take a look at 507015; probably the solution I wrote in the comments is fine, and I just didn't applied it to avoid flooding zuul.
On top of that, the patch above restores tests for the branch. When 2.0 is released, my proposal (copied from https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/phan/SecurityCheckPlugin/+/507617/) is to have PHP72 for the test pipeline, and 70-71-72 for gate-and submit. Until we'll upgrade phan again, and then drop 7.0 support.
507015 is now ready. It still doesn't behave as it should, but further improvement is left to be done with phan 2.2.
I also re-checked 508312. As I wrote in the commit message, the main problem of putting exec bits where nothing is exec'ed is that we'll mark all dependent methods are exec via markAllDependentMethodsExec. Thus, the exec taint could get very far from where it originated, and result in a huge amount of false positives.
Maybe a proper solution could be to add a new taint type, e.g. NonEscapableTaint. I think this wouldn't be quick, but unfortunately we cannot release 2.0 before this problem is fixed, or there'll be lots of false positives. @Bawolff Any thoughts?
@Daimona - so it sounds like r/508312 is a hard blocker for 2.x? If it's a concern about false positives I'm not sure I'd agree, if that's a current issue that's unrelated to updated versions of php, php-ast and phan. I feel like it might be more important to just get a 2.x release out the door at this point.
As for r/507015, if you think it's mergeable now, we should probably go ahead and do that, though I'm guessing from your comments above (T216974#5285258) that this isn't a hard blocker for 2.x at all.
Finally, I'm guessing that the merge of r/519190 is why we're seeing composer-package-php70-docker failures on some of these patch sets? I guess we can override and submit on a case-by-case basis, if need be.
- r508312: Yes, I believe it's a hard blocker. With the phan upgrade, many false positives started to appear due to that bug. I think that happens because the analysis is more powerful, and the previous behaviour (of not having false positives) just relied on this weakness. My concern is that without this patch, seccheck will fail for many repos due to false positives, thus making it harder to upgrade. I'm not even sure how to properly amend the patch. That said, I believe it *should* be fine to merge it for now, although not perfect: it could have false negatives, but they wouldn't block merges. And then we could push a proper fix without hurrying.
- r507015: mostly the same as above. I can't remember what kind of false positives you would get without it (and how many of them), though. But yes, that's ready for review. Improving its behaviour is something we can delay with no problems. Actually, I believe that the imperfections come from pre-existing bugs.
And yes, test failures are due to php-ast. I re-enabled CI because tests pass locally, and forgot that we still had to upgrade ast. Jerkins should be overridden for now, although I see Legoktm is working on T218719 to get it back working.