
HHVM forks (with 4 GB of copied memory!) stuck in a futex
Closed, Invalid · Public

Description

Ori noticed mw1114 was running more than one HHVM process, so I investigated.

Turns out this is some weird thing with HHVM spawning child processes:

$ ps -C hhvm -o pid,ppid,cmd
  PID  PPID CMD
 1027 12497 /usr/bin/hhvm --config /etc/hhvm/fcgi.ini --mode server
12497     1 /usr/bin/hhvm --config /etc/hhvm/fcgi.ini --mode server
25988 12497 /usr/bin/hhvm --config /etc/hhvm/fcgi.ini --mode server

So those are effectively two child processes of the main process that
upstart launches.

Both children are simply waiting on the same futex:

futex(0x7fce03707990, FUTEX_WAIT_PRIVATE, 2, NULL,...

So, child processes getting stuck on the same futex seemed like a
classic case of a fork() race condition, which I recently read about
on rachelbythebay:

https://rachelbythebay.com/w/2014/08/16/forkenv/
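
For context, the failure mode described in that post is easy to reproduce outside HHVM: if some thread holds a mutex at the instant another thread calls fork(), the child inherits the locked mutex but not the thread that would unlock it, so the child's first attempt to take that lock blocks forever in a futex wait. A minimal, standalone sketch of the race (illustrative C only, not HHVM code; compile with gcc -pthread):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Helper thread: take the lock and sit on it, standing in for e.g. a
   profiler or allocator thread caught mid-critical-section. */
static void *hold_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    sleep(60);                     /* still held when the fork happens */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, hold_lock, NULL);
    sleep(1);                      /* give the helper time to grab the lock */

    pid_t child = fork();
    if (child == 0) {
        /* Only the forking thread exists in the child; the owner of the
           mutex was not copied, so this blocks forever on a futex,
           just like the stuck hhvm children on mw1114. */
        fprintf(stderr, "child %d: waiting for the lock...\n", (int)getpid());
        pthread_mutex_lock(&lock);
        fprintf(stderr, "child: acquired (never reached)\n");
        _exit(0);
    }

    waitpid(child, NULL, 0);       /* parent hangs here until the child is killed */
    return 0;
}

Run under strace -f, the child should sit in the same futex(..., FUTEX_WAIT_PRIVATE, ...) call shown above; pthread_atfork() handlers are the conventional way libraries avoid handing a locked mutex to the child.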

Looking at the stack traces obtained with quickstack, both processes
look like threads gone wild. Since we had no RAM left to get a full
stack trace with gdb, I simply killed one of them (it's still Sunday
morning, after all, and we're running thin on memory, since these
processes apparently have all of the main process's memory copied
over). This really seems like some kind of bug in HHVM that we should
care about, and probably ask our Facebook friends about if we can't
figure out more.

The first thing we should probably do is make the Nagios alarm on hhvm
processes more restrictive (only matching the --config fastcgi ones,
for instance), so that we don't only notice this kind of thing because
someone happened to run "ps".

The stack traces are in /root/children_of_hhvm on mw1114

Event Timeline

Joe created this task. Nov 9 2014, 9:40 PM
Joe raised the priority of this task to High.
Joe updated the task description.
Joe changed Security from none to None.
Joe updated the task description.
Joe added subscribers: Joe, ori.
Joe updated the task description. Nov 9 2014, 9:43 PM
ori added a subscriber: swtaarrs. Nov 9 2014, 11:00 PM
ori added a subscriber: tstarling. Nov 10 2014, 2:02 AM
ori added a comment (edited). Nov 10 2014, 2:41 AM

I attach gdb to the hung child process, 1027. It's stuck on pthread_mutex_lock(). I examine the mutex and determine that its owner is thread 28686:

(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fcdfc2f3672 in _L_lock_953 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007fcdfc2f34da in __GI___pthread_mutex_lock (mutex=0x7fce03707990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:114
[...]

(gdb) frame 2
#2  0x00007fcdfc2f34da in __GI___pthread_mutex_lock (mutex=0x7fce03707990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:114

(gdb) print mutex.__data.__owner
$2 = 28686

I detach and attach to the parent process, 12497. I run info threads to locate thread 28686 and get its backtrace:

(gdb) info threads
  Id   Target Id         Frame
[snipped]
  13   Thread 0x7fca9afff700 (LWP 28686) "hhvm" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185

(gdb) thread 13
[Switching to thread 13 (Thread 0x7fca9afff700 (LWP 28686))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185

(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000015cc5f1 in waitImpl (ts=0x0, front=false, q=1520352832, id=id@entry=-205240784, this=0x7fcdf3d7aa68, this@entry=0x7fcdf3d7aad0) at /home/joe/HHVM/hhvm/hphp/util/synchronizable-multi.cpp:79
#2  HPHP::SynchronizableMulti::wait (this=this@entry=0x7fcdf3d7aa68, id=id@entry=88, q=q@entry=0, front=front@entry=false) at /home/joe/HHVM/hhvm/hphp/util/synchronizable-multi.cpp:43
#3  0x0000000000b6383b in HPHP::JobQueue<std::shared_ptr<HPHP::FastCGIJob>, false, HPHP::JobQueueDropVMStack>::dequeueMaybeExpiredImpl (this=this@entry=0x7fcdf3d7aa68, id=id@entry=88, q=q@entry=0, inc=inc@entry=true, now=..., expired=expired@entry=0x7fca9affb950) at /home/joe/HHVM/hhvm/hphp/util/job-queue.h:256
#4  0x0000000000b63b8c in dequeueMaybeExpired (inc=true, expired=0x7fca9affb950, q=0, id=88, this=0x7fcdf3d7aa68) at /home/joe/HHVM/hhvm/hphp/util/job-queue.h:169
#5  HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start (this=0x7fcb99b41c40) at /home/joe/HHVM/hhvm/hphp/util/job-queue.h:457
#6  0x00000000015b1ae7 in threadFuncImpl (this=0x7fcccaace100) at /home/joe/HHVM/hhvm/hphp/util/async-func.cpp:131
#7  HPHP::AsyncFuncImpl::ThreadFunc (obj=0x7fcccaace100) at /home/joe/HHVM/hhvm/hphp/util/async-func.cpp:51
#8  0x00007fcdfc2f1182 in start_thread (arg=0x7fca9afff700) at pthread_create.c:312
#9  0x00007fcdfb7fdfbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

...and then it's dinner time and I have to run.


I don't know what "threads gone wild" is meant to mean. When I checked, there were only 106 threads running, with hhvm.server.thread_count = 100.

Anyway, in process 1027, jemalloc is trying to collect a backtrace for heap profiling, so I guess the lock was corrupted by the fork. So if you don't want this to happen, disable heap profiling. It seems unlikely that you would see anything interesting in the parent process, long after the completion of the heap profile sample.
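
Whether jemalloc's heap profiler is actually on for a given process can be read back at runtime through its mallctl interface. A hedged sketch only: it assumes a jemalloc built with --enable-prof and an unprefixed API (mallctl rather than je_mallctl); link with -ljemalloc.

/* Report jemalloc's heap-profiling setting. "opt.prof" reflects the
   options jemalloc picked up from MALLOC_CONF / /etc/malloc.conf at
   startup; the name is only readable if jemalloc was built with
   profiling support. */
#include <stdio.h>
#include <stdbool.h>
#include <jemalloc/jemalloc.h>

int main(void)
{
    bool prof = false;
    size_t len = sizeof(prof);

    if (mallctl("opt.prof", &prof, &len, NULL, 0) != 0) {
        fprintf(stderr, "opt.prof not readable (jemalloc without --enable-prof?)\n");
        return 1;
    }
    printf("heap profiling: %s\n", prof ? "enabled" : "disabled");
    return 0;
}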

ori added a comment. Nov 10 2014, 4:43 AM

I had an /etc/malloc.conf that enabled heap profiling on mw1114, left over from debugging T820.

Joe closed this task as Invalid. Nov 10 2014, 5:53 AM
bd808 moved this task from Backlog to Done on the MediaWiki-Core-Team board. Nov 10 2014, 5:32 PM
bd808 moved this task from Done to Archive on the MediaWiki-Core-Team board. Nov 25 2014, 8:42 PM