Page MenuHomePhabricator

Isolate and fix memory leaks in HHVM
Closed, ResolvedPublic

Assigned To
Authored By
Oct 23 2014, 4:12 AM
Referenced Files
F550: pasted_file
Oct 29 2014, 4:53 AM
F537: loadhtml-heap.pdf
Oct 28 2014, 5:26 PM
F492: xmlNewText_backtrace.txt
Oct 26 2014, 6:44 PM
F409: hhvm_leak.gnumeric
Oct 23 2014, 11:52 PM
F393: hhvm_leak.gnumeric
Oct 23 2014, 6:11 AM


The memory leak rate we see in production is not significantly reduced by hacking the PCRE cache. Theories needed.

We don't need complete absence of memory leaks, we just need to be able to restart HHVM once every few days instead of hourly.

Event Timeline

tstarling claimed this task.
tstarling raised the priority of this task from to High.
tstarling updated the task description. (Show Details)
tstarling added a project: MediaWiki-Core-Team.
tstarling moved this task to Backlog on the MediaWiki-Core-Team board.
tstarling changed Security from none to None.
tstarling added a subscriber: tstarling.

With the URL

and a local hack to disable the #invoke hook, confirmed that luasandbox is not at fault. About 1833 bytes/req leaked with Lua, 2887 bytes/req leaked with Lua disabled.

Data collection one liner:

while true; do  curl -s http://localhost:9002/jemalloc-stats | perl -ne 'if (/<allocated>([^<]*)/) {print localtime() . ", " . time() . ", $1\n";}' ; sleep 10; done

ab command line:

ab -X localhost:80 -c150 -n 50000  ''

Tried without JIT with the same methodology, still leaks.

Previous results were apparently garbage -- I did a longer run and it shows a plateau after about 40k requests. Previous runs stopped at 50k requests so it was easy to miss the plateau in a noisy signal. Stopping the load injection before each memory usage measurement gave very little noise and a plateau after ~16k requests.

I rebuilt libjemalloc1 with --prof-enabled:

$ apt-get source libjemalloc1
$ cd jemalloc-3.5.1
$ cat >> debian/rules
        dh_auto_configure -- --enable-prof
$ dch --distribution trusty-wikimedia --local wmf "Rebuilt with --enable-prof"
$ sudo dpkg-buildpackage -b -uc

Then I installed the resultant package on mw1114 and mw1189, the HHVM API servers. (It's on the API servers that the leak is most evident.)

I then enabled jemalloc's heap profiling capabilities via /etc/malloc.conf. The way to do it is to make that file a symlink whose target is an option string rather than an actual file. Strange!

$ ln -sf "prof:true,prof_gdump:true,lg_prof_sample:21,prof_prefix:/tmp/heaps/" /etc/malloc.conf
  • prof_gdump tells jemalloc to dump a memory profile every time the total virtual memory exceeds the previous maximum.
  • lg_prof_sample specifies the average interval (log base 2) between allocation samples, as measured in bytes of allocation activity. 21 = 2 MiB. This is granular enough to be useful while keeping the computational overhead light.
  • prof_prefix sets the filename prefix for profile dumps. If unset, dumps are written to the process's working directory -- in this case /run/hhvm.

prof_gdump ends up being less useful than I thought. Because virtual memory usage is steadily increasing, there is a new high water mark every few seconds, so I end up with a lot of noise.

The next thing I try is to use the admin server API to request a heap profile every ten minutes.

# Request a heap profile from HHVM once every ten minutes.
# Profiles are written to /tmp/heaps.

sudo -u apache mkdir -p /tmp/heaps

curl "localhost:9002/jemalloc-prof-activate"

for (( i=1; ; i++ )); do
    curl "localhost:9002/jemalloc-prof-dump?file=/tmp/heaps/${i}.heap"
    sleep 600

My plan is to use pprof's --base arg to subtract a heap profile from its successor. The basic invocation looks like this:

$ google-pprof `which hhvm` --show_bytes --base=1.heap 2.heap

This puts you in interactive mode. You can run 'top' to see the top 10 routines, etc.

I hack together a simple shell script to help:

for (( i=1; ; i++ )); do
    base="$((i)).heap" prof="$((i+1)).heap"
    [ -e "$prof" ] || break
    echo "-------------------- $base => $prof --------------------"
    google-pprof `which hhvm` --show_bytes --text --base="$base" "$prof" 2>/dev/null | head -6

Here's the output for an hour's worth of heap profiles:

-------------------- 1.heap => 2.heap --------------------
Total: 573794313 B
402653184  70.2%  70.2% 402653184  70.2% HPHP::AsyncFuncImpl::start
48235840   8.4%  78.6% 56624736   9.9% xmlNewText
20972160   3.7%  82.2% 94374384  16.4% xmlSAX2StartElement
18913696   3.3%  85.5% 18913696   3.3% 00007fe80e4343b8
16777520   2.9%  88.5% 16777520   2.9% xmlStrndup
-------------------- 2.heap => 3.heap --------------------
Total: 483619254 B
402653184  83.3%  83.3% 402653184  83.3% HPHP::AsyncFuncImpl::start
50333056  10.4%  93.7% 56624648  11.7% xmlNewText
25166592   5.2%  98.9% 25166592   5.2% xmlNewNodeEatName
23069376   4.8% 103.6% 106957784  22.1% xmlSAX2StartElement
 8388760   1.7% 105.4%  8388760   1.7% xmlStrndup
-------------------- 3.heap => 4.heap --------------------
Total: 139819215 B
44041360  31.5%  31.5% 50332872  36.0% xmlNewText
43129392  30.8%  62.3% 43129392  30.8% HPHP::MemoryManager::newSlab
12583296   9.0%  71.3% 81790932  58.5% xmlSAX2StartElement
12583296   9.0%  80.3% 12583296   9.0% xmlNewNodeEatName
12582979   9.0%  89.3% 12582979   9.0% xmlStrndup
-------------------- 4.heap => 5.heap --------------------
Total: 1258200573 B
1207959552  96.0%  96.0% 1207959552  96.0% HPHP::AsyncFuncImpl::start
33555344   2.7%  98.7% 39846844   3.2% xmlNewText
12609571   1.0%  99.7% 12609571   1.0% 00007fe80e4343b8
12605141   1.0% 100.7% 12605141   1.0% pcre_compile2
12583296   1.0% 101.7% 12583296   1.0% xmlNewNodeEatName
-------------------- 5.heap => 6.heap --------------------
Total: 129656509 B
35652544  27.5%  27.5% 37749720  29.1% xmlNewText
33176456  25.6%  53.1% 33176456  25.6% HPHP::MemoryManager::newSlab
23069376  17.8%  70.9% 23069376  17.8% xmlNewNodeEatName
14694959  11.3%  82.2% 14694959  11.3% 00007fe80e4343b8
14680512  11.3%  93.5% 81791088  63.1% xmlSAX2StartElement

Working with XML could be the most common, memory-intensive things the API servers do, so the regular appearance of xmlNewText in these lists could be innocent, but it looks suspicious to me.

I'm out of time for now, but the next thing I plan to do is set a breakpoint on xmlNewText in gdb and capture some backtraces.

The backtraces that I've captured finger c_DOMDocument::t_loadhtml (in [[|hphp/runtime/ext/ext_domdocument.cpp:3381]]) and its IDL proxy DOMDocument::loadHTML (in [[|hphp/system/idl/domdocument.idl.json:2639]]) as most suspicious. The frames above them are generic HHVM execution context things and the frames below are in libxml.

Running ApiHelp::fixHelpLinks() on some sample HTML (from leaked quite nicely. Further isolated to this:

for ( $i =0 ; $i < 1000; $i++ ) {
  $doc = new DOMDocument;
  $xpath = new DOMXPath( $doc );
  $nodes = $xpath->query( '//a' );

The leak does not occur if $xpath->query() is omitted.

Awesome, thanks for the investigation. I can take a look at fixing this unless one of you already is.

<TimStarling> for ( $i =0 ; $i < 1000; $i++ ) { $doc = new DOMDocument; $doc->loadHTML($html); $node = $doc->documentElement->firstChild; }
<TimStarling> no need for xpath, you just need a DOMNode reference to be alive when $doc is overwritten

swtaars: awesome, go for it.

Further gdb work showed that fetching a node from a document as in the test case above created a circular reference which prevented the destructor being called, but the memory was freed on sweep. Also, fetching a few ApiHelp URLs with ab did not appear to permanently increase memory usage.

Grep indicates that the extension callers of DOMDocument::loadHTML() are CirrusSearch\BuildDocument\PageTextBuilder, TextExtracts, MobileFrontend, DonationInterface, Flow\Parsoid\Utils, CommonsMetadata and SemanticForms SF_AutoeditAPI.

Test case:

require_once '/srv/mediawiki/multiversion/MWVersion.php';
require_once getMediaWiki( 'maintenance/', 'enwiki' );

$title = Title::newMainPage();
$page = WikiPage::factory( $title );
$content = $page->getContent();
$parserOutput = $content->getParserOutput( $title );
$doc = new \Elastica\Document;
$textBuilder = new \CirrusSearch\BuildDocument\PageTextBuilder( $doc, $content, $parserOutput );

for ($i = 0; $i < 100; $i++) {

$usage = getrusage();
printf("%.2f KiB\n", $usage['ru_maxrss'] / 1024);

Invoke with --count N on the app servers.

ru_maxrss is measured in KB, so dividing it by 1024 you actually get MB not KB.

I tried running the test case under valgrind -- it reports the memory in question as "still reachable" at exit.

Here's some stuff I wrote up before seeing @ori's most recent test case:

Can you live with that cyclic reference for now? It sounds like the global leak is a much bigger problem, so I've been focusing on trying to find that. I noticed that none of the repro cases we were using seemed to reference xmlNewText(), which is the main culprit in your heap dumps. The constructor for c_DOMText looks like the main caller for that, which interestingly doesn't get called if you create a DOMText object by traversing the dom from a DOMDocument root. It does get called if the object is created manually though, and running this program with --count=100000 shows a pretty clear and enormous leak:


for ($i = 0; $i < 100000; ++$i) {
  new domtext('whoops');
var_dump(getrusage()['ru_maxrss'] / 1024);

I can't tell if that happens in the repro from @ori, but it could be the source of your leak. I'm trying out some stuff on mw1114 to confirm.

I can't find any evidence of that path getting hit; it looks like all the leaked memory is still coming from the call to htmlParseDocument in dom_load_html (I've been looking at call graphs from pprof's --pdf option).

Here's one of those graphs, from running the repro case for a while:

. 90% of the heap is from DOMDocument::loadHTML

Minimal test case:

for ($i = 0; $i < 100; $i++) {
    $doc = new DOMDocument();
    $el = $doc->getElementsByTagName('body')->item(0);
var_dump(getrusage()['ru_maxrss'] / 1024);

Hackish fix for confirmation purposes:

diff --git a/hphp/runtime/ext/ext_domdocument.cpp b/hphp/runtime/ext/ext_domdocument.cpp
index 0f4f8f1..402c113 100644
--- a/hphp/runtime/ext/ext_domdocument.cpp
+++ b/hphp/runtime/ext/ext_domdocument.cpp
@@ -2299,6 +2299,7 @@ Variant c_DOMNode::t_removechild(const Object& node) {
   while (children) {
     if (children == child) {
+      appendOrphan(*doc()->m_orphans, child);
       return create_node_object(child, doc(), true);
     children = children->next;

the whole day went on and any leak, if present, is now minor. I'd say the fix is working.

Need better workaround since the patch is segfaulting. FB may potentially fix the upstream bug, but it will take weeks or months.

I rebased Tim's fix onto current master and the small repro is segfaulting now. I took a deeper look at the code and think I might have a more robust fix. I'll update as soon as I know more.

Rebased Brett's patch to HHVM-3.3.0 and rebuilt our packages with it applied.
Brett's patch:
Rebased patch:
New package version: 3.3.0+dfsg1-1+wm2

After testing package on mw1017 and beta, I upgraded all HHVM app servers.

Crashes are still happening with the new patch, which I kinda expected. We poked at a core file for a while and it looks like it's crashing on a double free of an <a> element with these parents: <body><div id="mainpage"><table id="mp-tfa"><tr><td><p><a href="/wiki/প্রামাণ্যচিত্র">প্রামাণ্যচিত্র</a>. The document also contains headers for "featured article" and "did you know", strongly suggesting it has something to do with this code:

Repro case:

$n = 32;
while ( $n-- ) {
  	$doc = new DOMDocument();
  	$doc->loadHTML( '<html><body><div id="test"><img /></div></body></html>' );

  	$element = $doc->getElementsByTagName( 'img' )->item( 0 );
  	$element->parentNode->replaceChild( $doc->createElement( 'img' ), $element );

  	$content = $doc->createElement( 'div' );
  	$content->appendChild( $doc->getElementById( 'test' ) );

Doesn't crash for n < 32 for whatever reason.

When I run the latest repro in an asan build, it reliably yells about a double free regardless of $n. This diff applied on top of the previous patch should fix the crash, and also another leak:

Combined patch for 3.3.0:

--- a/hphp/runtime/ext/ext_domdocument.cpp
+++ b/hphp/runtime/ext/ext_domdocument.cpp
@@ -2296,6 +2296,7 @@
   while (children) {
     if (children == child) {
+      appendOrphan(*doc()->m_orphans, child);
       return create_node_object(child, doc(), true);
     children = children->next;
@@ -2357,7 +2358,11 @@
       xmlReplaceNode(oldchild, newchild);
       dom_reconcile_ns(nodep->doc, newchild);
-    return create_node_object(oldchild, doc(), false);
+    if (domnewchildnode->doc().get()) {
+      removeOrphanIfNeeded(*domnewchildnode->doc()->m_orphans, domnewchildnode->m_node);
+    }
+    return create_node_object(oldchild, doc(),
+                              oldchild->parent == nullptr);

   php_dom_throw_error(NOT_FOUND_ERR, doc()->m_stricterror);
ori raised the priority of this task from High to Needs Triage.
ori moved this task from In Dev/Progress to Done on the MediaWiki-Core-Team board.