
Wikidata dump maintenance scripts cause HHVM to leak memory heavily, not doing GC
Closed, Resolved (Public)

Description

This has been discovered in T161577 and tested in T161577#3139796, though we still don't know the exact cause.

The issue occurs with both HHVM 3.12.7 and the newer HHVM 3.18.1.

Apparently this is related to HHVM not doing garbage collection: T161695#3156462.

Event Timeline

Same on mwdebug1001 with hhvm 3.18:

23336 www-data 20 0 2264696 912092 50000 R 91.5 22.5 13:00.97 php /srv/mediawiki/multiversion/MWScript.php extensions/Wikidata/extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki wikidatawiki --sharding-factor 5 --shard 0 --snippet
Processed 35667 entities.

hoo@mwdebug1001:~$ php --version
HipHop VM 3.18.1 (rel)
Compiler: 3.18.1+dfsg-1+wmf1
Repo schema: 47e86d9e41dac7800783dc589035875b00f7231
hoo added subscribers: daniel, Stas.

Mentioned in SAL (#wikimedia-operations) [2017-03-29T11:30:30Z] <hoo> Started a Wikidata JSON dump run on snapshot1007 using Zend (due to T161695).

Mentioned in SAL (#wikimedia-operations) [2017-03-29T18:43:07Z] <hoo> Started a Wikidata TTL dump run on snapshot1007 using Zend (due to T161695).

I looked into this briefly today, and it does not seem to be caused by any HHVM-specific handling within one of the Wikibase components, so we are probably indeed dealing with an HHVM bug.

@daniel suggested I should check whether garbage collection runs properly in our scripts.

I tested that by occasionally running [[https://secure.php.net/manual/de/function.gc-collect-cycles.php|gc_collect_cycles()]] within the script in question, and this actually fixes the problem.
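
For illustration, the workaround could look roughly like the sketch below. This is not the actual change to the dump script: the function name `processWithPeriodicGc`, the interval of 1000 and the toy handler are all hypothetical. Only `gc_collect_cycles()` is the real PHP function referred to above.

```php
<?php
// Hypothetical sketch: none of these names come from dumpRdf.php.

/**
 * Run $handle on every item and force a cycle collection every
 * $gcInterval items. Returns how often gc_collect_cycles() was called.
 */
function processWithPeriodicGc(array $items, callable $handle, $gcInterval = 1000) {
    $processed = 0;
    $collections = 0;
    foreach ($items as $item) {
        $handle($item);
        $processed++;
        if ($processed % $gcInterval === 0) {
            // Explicitly collect reference cycles that refcounting alone cannot free.
            gc_collect_cycles();
            $collections++;
        }
    }
    return $collections;
}

// Toy usage: each call creates two objects that reference each other (a cycle).
$collections = processWithPeriodicGc(range(1, 5000), function ($id) {
    $a = new stdClass();
    $b = new stdClass();
    $a->other = $b;
    $b->other = $a;
}, 1000);

echo "Forced $collections collections\n"; // prints "Forced 5 collections"
```

The interval is a trade-off: collecting too often costs CPU time, collecting too rarely lets cyclic garbage pile up between runs.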

hoo renamed this task from Wikidata dump maintenance scripts cause HHVM to leak memory heavily to Wikidata dump maintenance scripts cause HHVM to leak memory heavily, not doing GC. Apr 5 2017, 7:58 AM
hoo updated the task description.
hoo claimed this task.

Garbage collection is simply disabled under HHVM, whereas it is enabled under Zend PHP 5:

hoo@snapshot1007:~$ sudo -u datasets php5 /srv/mediawiki/multiversion/MWScript.php eval.php --wiki wikidatawiki
> var_dump(gc_enabled());
bool(true)

> 
hoo@snapshot1007:~$ sudo -u datasets php /srv/mediawiki/multiversion/MWScript.php eval.php --wiki wikidatawiki
> var_dump(gc_enabled());
bool(false)
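
Given the `gc_enabled()` results above, a script could in principle check the collector state at startup and try to switch it on. The following is only a sketch under assumptions: the helper name `ensureGcEnabled` is hypothetical, and this thread does not establish whether HHVM honours `gc_enable()`. Under Zend PHP, `gc_enable()` and `gc_enabled()` are standard functions.

```php
<?php
// Hypothetical helper, not part of MediaWiki or Wikibase.

/**
 * Try to turn on the cycle collector if it is off, then report its state.
 * Assumption: the runtime honours gc_enable(); this is unverified for HHVM.
 */
function ensureGcEnabled() {
    if (!gc_enabled()) {
        gc_enable();
    }
    return gc_enabled();
}

var_dump(ensureGcEnabled());
```

If the runtime ignores `gc_enable()`, the helper returns false, and explicit `gc_collect_cycles()` calls remain the fallback.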