HHVM with FastCGI does not support streaming output
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	tstarling
	Mar 3 2015, 11:36 PM

Description

As discovered during investigation of T89918, HHVM does not support streaming output in any mode except CLI.

Related Objects
Search...

View Standalone Graph

This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Status	Subtype	Assigned	Task
			· · ·
Resolved	PRODUCTION ERROR	• demon	T89918 OOM reported at SpecialPage.php:534 due to large output from Special:Book
Resolved		Joe	T84842 Convert eqiad imagescalers to HHVM, Trusty
Resolved		tstarling	T91468 HHVM with FastCGI does not support streaming output
			· · ·

Event Timeline

tstarling created this task.Mar 3 2015, 11:36 PM

tstarling raised the priority of this task from to High.

tstarling updated the task description. (Show Details)

tstarling added projects: HHVM, MediaWiki-Core-Team.

tstarling added subscribers: • MZMcBride, • Manybubbles, tstarling and 5 others.

tstarling claimed this task.Mar 4 2015, 1:05 AM

jeremyb subscribed.Mar 5 2015, 5:42 AM

The output functions in execution-context.cpp can be made to stream their output to the transport, I have a patch for that. But then the transport allocates an unbounded amount of memory and eventually either segfaults (presumably due to malloc failure) or gets killed by oom-killer.

It seems that the relevant interfaces do not support streaming writes. The abstract parent class AsyncTransportWrapper has a write() method which promises to either send the buffer immediately or queue infinite amounts of data. It does not even keep track of how much data or how many blocks its write queue owns. It unconditionally takes ownership of the caller's buffer, and has no interface to return ownership to the caller in the event of a buffer overrun.

FastCGISession does keep track of the number of async requests that are pending (m_eventCount), but there is no synchronisation infrastructure which would make it easy to wait for WriteCallback::writeSuccess() to be called on buffer overrun.

Also, FastCGISession calls EventBase::runInEventBaseThread() for each write request, in order to pass the write buffers to the FastCGI server thread. This again uses an unbounded queue. A separate FastCGI server thread is responsible for writing. Even if the FastCGI server thread could detect a buffer overrun and wait, it has no way to block the main app thread.

Legoktm subscribed.Mar 5 2015, 6:15 AM

tstarling updated the task description. (Show Details)Mar 6 2015, 5:27 AM

tstarling set Security to None.

This goes into the too-hard basket for now. Maybe after the HHVM devs are back online after their "infra summit", we can talk about strategies.

bd808 added a project: Upstream.Mar 6 2015, 4:15 PM

I have had a chat with sgolemon, paulbiss and afrind, and it's back into the "maybe possible" pile.

It seems to me that emulating blocking I/O is properly a responsibility of the I/O library, but there is a precedent for emulating it in the FastCGI layer. Blocking reads are emulated in FastCGITransport::getMorePostData() using a mutex local to the transport. So blocking writes could be done in the same way.

There is the question of why the VM thread is handing off its writes to a single-threaded async server instead of having an acceptor thread grant ownership of the connection to the VM thread, which would allow a FastCGI implementation with true blocking I/O. The rationale is apparently to allow VM threads to end the request and move on to something else, while the output is still being sent. So you can have a large connection limit and a small VM thread pool, minimising memory usage. The benefit for us is pretty small, since we have Varnish buffering responses, but for users with HHVM directly facing the web, it could be significant.

The current plan is for me to flesh out the design a bit more and assess whether I can do this myself.

Joe subscribed.Mar 10 2015, 7:17 AM

bd808 mentioned this in T89918: OOM reported at SpecialPage.php:534 due to large output from Special:Book.Mar 12 2015, 2:10 AM

tstarling moved this task from Backlog to In Dev/Progress on the MediaWiki-Core-Team board.Mar 12 2015, 5:23 AM

• Gilles subscribed.Mar 13 2015, 4:09 PM

• Gilles mentioned this in T89114: "show thumbnail" option disappears once file is uploaded.Mar 13 2015, 4:33 PM

MusikAnimal subscribed.Mar 13 2015, 4:36 PM

bd808 added a parent task: T84842: Convert eqiad imagescalers to HHVM, Trusty.Mar 13 2015, 5:39 PM

bd808 mentioned this in T84842: Convert eqiad imagescalers to HHVM, Trusty.Mar 13 2015, 5:42 PM

JanZerebecki subscribed.Mar 13 2015, 5:58 PM

Tgr mentioned this in T90599: Displaying the stashed file as a thumbnail in UploadWizard fails with an 503.Mar 13 2015, 6:10 PM

Steinsplitter subscribed.Mar 13 2015, 6:34 PM

Matanya subscribed.Mar 17 2015, 6:12 PM

Here is the patch as it stands:

output-streaming.patch12 KBDownload

It seems to work. I've tried a print loop and readfile('/dev/zero'), with and without ob_implicit_flush(), it does not use a significant amount of memory. I'm now rebuilding in release mode for performance testing. I'm not optimistic about performance, but it may be good enough for us as a stopgap even if it would be rejected upstream.

Well, strangely enough, the patched version is 1.7 times faster on my laptop when streaming a 1MB file with readfile(). I thought that lower system bus traffic might give it an advantage, but I didn't expect it to overcome the synchronisation overhead that I added. So maybe it will be acceptable upstream after all.

Smalyshev subscribed.Mar 18 2015, 5:28 AM

@tstarling did you test the patch against the development HEAD of HHVM? I'll try seeing if it applies to our version (3.3.x) as the package needed refreshing anyways. I'll let you know if I need help backporting it.

I think it would take a lot of work to backport it, since the relevant code was severely refactored in February. I think the simplest way to do it would be to backport those changes also, and any other recent changes in hphp/runtime/server/fastcgi . The transport/server interface is a plugin boundary -- Facebook have a closed-source server implementation which connects to it, so the API should be pretty stable. Note that the commit I linked above only changes files in the fastcgi directory.

Another option would be to upgrade the whole of HHVM.

I think we should upgrade HHVM to 3.6 as soon as possible when it comes out. It will make 95% of our patches redundant, a fact that will make me very happy :)

Given 3.6 should be out in a couple of weeks, I can wait for it as far as coverting the imagescalers is concerned. If we can live with the other issues we have for around one month, I would wait for that, but let's see what the feedback will be.

As I reported at T89918, I have had a closer look at the 5xx logs, and they suggest that there is not a huge amount of urgency to pushing out this patch.

How difficult would it be to apply the patch to testwiki? I'm still actively investigating T90599 which might be caused by this. It's reproducible on testwiki. If we can confirm with certainty that it's this hhvm bug causing it, it would save me a lot of time spent trying to figure out the upload stash issue and it would allow us to give a clear ETA to Commons users.

@Gilles I just found out that HHVM 3.6.0 has been released the day after I last checked. So I will devote some time to building a new package in the next few days

Joe mentioned this in T93194: Create an HHVM 3.6.0 package, adding Tim's streaming patch.Mar 19 2015, 12:09 PM

Joe mentioned this in rODAZ2c307c03d889: Add support for output streaming.Mar 23 2015, 3:51 PM

Joe mentioned this in rODAZd0e01e86a227: Add support for output streaming.Apr 1 2015, 10:56 AM

Liuxinyu970226 subscribed.Apr 3 2015, 5:04 AM

bd808 removed a project: MediaWiki-Core-Team.Apr 8 2015, 10:32 PM

With segfault fix:

output-streaming-v2.patch12 KBDownload

Segfault fix only:

diff --git a/hphp/runtime/base/execution-context.cpp b/hphp/runtime/base/execution-context.cpp
index ee2a793..11146ba 100644
--- a/hphp/runtime/base/execution-context.cpp
+++ b/hphp/runtime/base/execution-context.cpp
@@ -315,6 +315,10 @@ void ExecutionContext::obClean(int handler_flag) {
 bool ExecutionContext::obFlush() {
   assert(m_protectedLevel >= 0);
 
+  if (m_buffers.empty()) {
+    return false;
+  }
+
   auto iter = m_buffers.end();
   OutputBuffer& last = *(--iter);

In T91468#1131561, @Joe wrote:

HHVM 3.6.0 has been released the day after I last checked. So I will devote some time to building a new package in the next few days

@Joe: Has that happened in the meantime (and what does that mean for this task)?

This seems to be the patch from above https://phabricator.wikimedia.org/diffusion/ODAZ/browse/master/debian/patches/output-streaming.patch which thus means that is included what we are running on CI and thus I assume also on production (didn't check). (So yes it has happened) Which means in the context of the blocked tasks this needs to be verified to fix the issues. Also someone needs to try to get this upstream or at least have an upstream bug open which references our not upstreamable solution.

Ricordisamoa subscribed.Jul 14 2015, 11:48 AM

In T91468#1131561, @Joe wrote:

HHVM 3.6.0 has been released the day after I last checked. So I will devote some time to building a new package in the next few days

@Joe: Has that happened in the meantime (and what does that mean for this task)?

@Aklapper it was resolved, as T93194 testifies

Joe closed this task as Resolved.Aug 24 2015, 12:57 PM

Liuxinyu970226 unsubscribed.Aug 25 2015, 2:24 AM

faidon mentioned this in T119637: Update HHVM package to recent release.Dec 16 2015, 5:52 PM

alex-mashin subscribed.Mar 9 2016, 4:43 AM

	F99689: output-streaming.patch
	Mar 18 2015, 3:12 AM

HHVM with FastCGI does not support streaming outputClosed, ResolvedPublicActions

Description

Related ObjectsSearch...

Event Timeline

HHVM with FastCGI does not support streaming output
Closed, ResolvedPublic
Actions

Related Objects
Search...