Page MenuHomePhabricator

make phan-taint-check handle array_map
Open, Needs TriagePublic

Description

E.g. line 865 DatabasePostgres.php, of which a simplified version is:

$arr = some_unsafe_data_in_an_array;
$list = implode( ',', array_map( [ $this, 'addQuotes' ], $arr ) );
$this->query( "SELECT * FROM tb WHERE fld IN ($list)" );

Event Timeline

So, phan does this with an ad-hoc internal plugin (ArrayReturnTypeOverridePlugin, source), where it keeps a map of array functions and closures that applies their effects on the array shape.

I think we should adopt a similar approach, i.e. hook into getTaintOfFunctionPHP and override the taint for array functions. The list used by phan can be used as a baseline for what functions we should handle and what's their effect, but aside from that we cannot reuse anything because phan works on union types, and we work on taintedness.

It goes without saying that any progress in this task should not happen before T253875.

Given the recent improvements re tracking array shapes, I think it might be a good time to do this. I still think that we should keep a list of "special-cased" internal functions, and check that list at the beginning of getTaintOfFunctionPHP. For array_map, it would be something similar to handling a call to the provided callback, passing the following arguments to array_map as arguments to the callback. I'll try this soon-ish.

This would be nice to have, it comes up as an annoyance fairly often, e.g. https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1000243

For reference, the current behavior of array_map is implemented here: "For now we only preserve taintedness of array arguments." https://gerrit.wikimedia.org/g/mediawiki/tools/phan/SecurityCheckPlugin/+/01fb7e924a2e9fa58e88eda61775feccee38aad6/src/TaintednessBaseVisitor.php#2318

Very old and experimental patch: https://gerrit.wikimedia.org/r/c/mediawiki/tools/phan/SecurityCheckPlugin/+/656523

I'd say the only useful thing from that patch are the test cases, as the actual code needs to be rewritten from scratch. I'm not sure if this would be easier to do now than in 2021.