
IPSet::__construct() gets into an infinite loop when called via curl on a CI host
Closed, ResolvedPublic

Description

IP::isPublic() calls

new IPSet( array(
	'10.0.0.0/8', # RFC 1918 (private)
	'172.16.0.0/12', # RFC 1918 (private)
	'192.168.0.0/16', # RFC 1918 (private)
	'0.0.0.0/8', # this network
	'127.0.0.0/8', # loopback
	'fc00::/7', # RFC 4193 (local)
	'0:0:0:0:0:0:0:1', # loopback
	'169.254.0.0/16', # link-local
	'fe80::/10', # link-local
) );

which ends up in an infinite loop when Jenkins calls various things via curl for the qunit tests:
https://integration.wikimedia.org/ci/job/mediawiki-extensions-qunit/30149/console

( ! ) Fatal error: Maximum function nesting level of '100' reached, aborting! in /mnt/jenkins-workspace/workspace/mediawiki-extensions-qunit/src/vendor/wikimedia/ip-set/src/IPSet.php on line 219
Call Stack
#   Time  Function  Location
1   {main}( )  ../index.php:0
2   require( '/mnt/jenkins-workspace/workspace/mediawiki-extensions-qunit/src/includes/WebStart.php' )  ../index.php:40
3   require_once( '/mnt/jenkins-workspace/workspace/mediawiki-extensions-qunit/src/includes/Setup.php' )  ../WebStart.php:152
4   MediaWiki\Session\SessionManager->checkIpLimits( )  ../Setup.php:814
5   IP::isPublic( )  ../SessionManager.php:1075
6   IPSet\IPSet->__construct( )  ../IP.php:391
7   IPSet\IPSet::recOptimize( )  ../IPSet.php:104
8   IPSet\IPSet::recOptimize( )  ../IPSet.php:220
9   IPSet\IPSet::recOptimize( )  ../IPSet.php:220
10  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
11  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
12  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
13  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
14  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
15  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
16  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
17  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
18  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
19  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
20  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
21  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
22  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
23  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
24  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
25  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
26  IPSet\IPSet::recOptimize( )  ../IPSet.php:220
27  IPSet\IPSet::recOptimize( )  ../IPSet.php:220

The same call works fine in other contexts on the same machine, and via curl on my local machine. IPSet does not have configuration or any other kind of environmental dependencies.

Related Objects

Event Timeline

Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description. (Show Details)
Tgr subscribed.

Change 269900 had a related patch set uploaded (by Gergő Tisza):
Use IP::isPublic() in SessionManager::checkIpLimits()

https://gerrit.wikimedia.org/r/269900

Looking at the logs, it seems the bit tree used to store the address ranges is legitimately >100 levels deep, so probably the recursion limit is somehow lower for CI web requests.

+ @BBlack who wrote the IPSet matching code.

No idea how the code works, but an IPv6 address is 128 bits long. Looking at the log output from Vagrant, the array indeed has ~128 levels.

On both Precise and Trusty instances, the Zend PHP has xdebug.max_nesting_level set to 100. Boom, test fails.

So something in your code triggers a nasty nested array which apparently is not covered by IPSet tests, or IPSet would have failed tests running on CI with xdebug.max_nesting_level=100.

It would be nice to gather more evidence. The parameter being passed to IP::isPublic() might help, though it should really be 127.0.0.1.

Yes, the code requires >100 levels of recursion.

https://github.com/wikimedia/IPSet/blob/master/tests/IPSetTest.php#L35 is where the unit test workaround is at, it probably needs updating (as do the settings for runtime, wherever IPSet is used).
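To illustrate why ~128 levels show up: a binary trie over addresses uses one level per address bit, so a single 128-bit IPv6 entry alone produces a nested-array structure ~128 levels deep, and any recursive walk of it needs that many stack frames, which blows the default xdebug limit of 100. A minimal sketch (illustrative only, not IPSet's actual code):

```php
<?php
// Sketch (not IPSet's real code): insert one 128-bit IPv6 address into a
// binary trie of nested PHP arrays, one trie level per address bit.
$bin = inet_pton( '::1' );          // 16 bytes = 128 bits
$root = [];
$node =& $root;
$depth = 0;
for ( $i = 0; $i < 16; $i++ ) {
	$byte = ord( $bin[$i] );
	for ( $b = 7; $b >= 0; $b-- ) {
		$bit = ( $byte >> $b ) & 1;
		if ( !isset( $node[$bit] ) ) {
			$node[$bit] = [];
		}
		$node =& $node[$bit];       // descend one level per bit
		$depth++;
	}
}
echo $depth, "\n";                  // 128 — already past xdebug's default 100
```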

So something in your code triggers a nasty nested array which apparently is not covered by IPSet tests, or IPSet would have failed tests running on CI with xdebug.max_nesting_level=100.

I note that phpunit.php also sets xdebug.max_nesting_level higher for the phpunit tests, and the phpunit tests in Gergő's patch are passing. The failing tests here are qunit, not phpunit.

Presumably the existing qunit tests just aren't sending any requests with an X-Forwarded-For header (which seems to be the only other thing using IP::isPublic()) or hitting anything else that constructs an IPSet with IPv6 addresses. Gergő's addition here calls IP::isPublic() on every request, so every qunit test is going to hit it.

It would be nice to gather more evidence. The parameter being passed to IP::isPublic() might help, though it should really be 127.0.0.1.

Since it's happening in the IPSet constructor (which doesn't get passed the $ip) rather than the call to match(), that's unlikely to matter.

So we need the local webhost used by QUnit to have a higher xdebug.max_nesting_level.

The Apache vhost is set up via modules/contint/manifests/localvhost.pp. So we can bump the xdebug value either:

A) directly in the Apache vhost by editing modules/contint/templates/apache/localvhost.erb

B) Add a new file under /etc/php5/apache2/conf.d/
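For option B, the dropped-in file would be a one-line ini fragment along these lines (filename hypothetical; the value just needs to comfortably exceed the ~128 levels IPSet requires):

```ini
; e.g. /etc/php5/apache2/conf.d/xdebug-nesting.ini (hypothetical filename)
; Raise Xdebug's function nesting limit above the ~128 frames IPSet needs.
xdebug.max_nesting_level = 1000
```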

Maybe large IPSet rules should be precompiled? 100+ levels of recursion sounds expensive...

It's only ~128 levels, and it's not that expensive really. And any pre-compilation of course still happens under the same interpreter with the same constraint that the recursion-level setting needs appropriate adjustment.

More to the point, though, the constructor is effectively the pre-compilation step. The intent is you construct an IPSet object infrequently (e.g. when a process starts, or on a configuration change), and then just invoke the object's match method for heavy runtime traffic.

meta-note: it would be best if IPSet's documentation made a note that it requires the interpreter/engine config change for recursion limits...
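For contrast with that construct-once/match-many design: without the precompiled trie, every match would have to linearly re-check each CIDR rule. A hypothetical naive IPv4-only matcher (not IPSet's code) makes the trade-off concrete; it costs O(rules) per call, whereas the trie built in the constructor gives lookups bounded by the address width:

```php
<?php
// Hypothetical naive linear matcher (IPv4 only), for contrast with the
// trie that IPSet precompiles in its constructor.
function naiveMatch( string $ip, array $cidrs ): bool {
	$addr = ip2long( $ip );
	foreach ( $cidrs as $cidr ) {
		list( $net, $bits ) = explode( '/', $cidr );
		$bits = (int)$bits;
		// Build the network mask; /0 matches everything.
		$mask = ( $bits === 0 ) ? 0 : ( -1 << ( 32 - $bits ) );
		if ( ( $addr & $mask ) === ( ip2long( $net ) & $mask ) ) {
			return true;
		}
	}
	return false;
}

var_dump( naiveMatch( '10.1.2.3', [ '10.0.0.0/8', '192.168.0.0/16' ] ) ); // bool(true)
var_dump( naiveMatch( '8.8.8.8', [ '10.0.0.0/8', '192.168.0.0/16' ] ) );  // bool(false)
```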

Circling back around to this and looking again: of course it's possible that a better answer here is to refactor recOptimize and recCompress to use loops in place of recursion. In theory, recursive things can always be rewritten as loops that maintain some kind of state-stack variable, e.g. an array stack that grows by one entry for each virtual level of recursion in the loop.

The upside is that it could remove the constraint that using this module requires setting a special interpreter/engine config variable in the common case. The primary potential downside is that the code might get less readable (and it's already pretty bad). The other potential fallout is the perf impact of that refactor, which is unknown, but I'd expect an improvement, and in the constructor case perf isn't as important as it is in match().
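The state-stack idea mentioned above can be sketched as follows (a generic iterative walk over a nested-array trie, with hypothetical names; not the actual recOptimize/recCompress refactor): each `$stack[] = ...` push stands in for one virtual level of recursion, so the walk handles arbitrarily deep tries without ever hitting the engine's function nesting limit.

```php
<?php
// Sketch of replacing recursion with a loop plus an explicit state stack.
// Counts leaf nodes of a nested-array trie without any function recursion.
function countLeavesIteratively( array $root ): int {
	$leaves = 0;
	$stack = [ $root ];                // explicit stack replaces call frames
	while ( $stack ) {
		$node = array_pop( $stack );
		if ( $node === [] ) {
			$leaves++;                 // empty array = leaf
			continue;
		}
		foreach ( $node as $child ) {
			$stack[] = $child;         // one push per virtual recursion level
		}
	}
	return $leaves;
}

// A 200-level-deep chain: a naive recursive walk would exceed the default
// xdebug.max_nesting_level of 100, but the loop handles it fine.
$deep = [];
for ( $i = 0; $i < 200; $i++ ) {
	$deep = [ $deep ];
}
echo countLeavesIteratively( $deep ), "\n"; // 1
```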

Circling back around to this and looking again: of course it's possible that a better answer here is to refactor recOptimize and recCompress to use loops in place of recursion. In theory, recursive things can always be rewritten as loops that maintain some kind of state-stack variable, e.g. an array stack that grows by one entry for each virtual level of recursion in the loop.

The upside is that it could remove the constraint that using this module requires setting a special interpreter/engine config variable in the common case. The primary potential downside is that the code might get less readable (and it's already pretty bad). The other potential fallout is the perf impact of that refactor, which is unknown, but I'd expect an improvement, and in the constructor case perf isn't as important as it is in match().

Soon after IPSet was added to core, I worked on a patch that avoids the recursion, though never uploaded it to Gerrit. Actually, I had multiple versions of the patch that took different approaches. I'll see if I can update one of them for the current version of IPSet.

Change 270301 had a related patch set uploaded (by PleaseStand):
Optimize and compress the tries as nodes are added

https://gerrit.wikimedia.org/r/270301

Change 269900 abandoned by Gergő Tisza:
Use IP::isPublic() in SessionManager::checkIpLimits()

Reason:
checkIpLimits was removed.

https://gerrit.wikimedia.org/r/269900

Change 270301 merged by jenkins-bot:
Optimize and compress the tries as nodes are added

https://gerrit.wikimedia.org/r/270301

My understanding is that the patch @PleaseStand provided on https://gerrit.wikimedia.org/r/#/c/270301/ and @BBlack approved dramatically lowers the level of recursion, which would make it work for folks with the default xdebug setting of 100 levels. That is frankly quite awesome. Thank you all!

I would like a new version of IPSet to be cut and included in MediaWiki core before 1.27. I am filing subtasks.

hashar triaged this task as High priority.Feb 26 2016, 9:53 AM
hashar removed a project: Patch-For-Review.

Fixed as of IPSet 1.1.0, which is included in the MW 1.27 release.