Support SPDY
Closed, ResolvedPublic

Description

This tracks the rollout of the SPDY protocol, and possibly HTTP/2.0, across Wikimedia's production clusters.

bzimport added projects: HTTPS, Performance.Via ConduitNov 22 2014, 12:02 AM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz33890.
bzimport created this task.Via LegacyJan 22 2012, 9:36 PM
brion added a comment.Via ConduitJan 22 2012, 10:26 PM

Firefox 11 should include SPDY, though I think it's disabled by default.
Some dumb questions:

  • does this require using alternate URLs to access resources, eg spdy://en.wikipedia.org/ ?
  • if so, where would we introduce such URLs other than telling people for testing?
  • is there any facility for auto-upgrading a connection?
  • is the use of multiple domains, eg bits & upload, an issue for spdy? would this undo some of its ability to multiplex connections with common flow control, or would this be within normal expectations?
Peachey88 added a comment.Via ConduitJan 22 2012, 10:47 PM

(In reply to comment #1)

  • does this require using alternate URLs to access resources, eg spdy://en.wikipedia.org/ ?

AFAIK the client automatically does the URL magic if SPDY is supported (e.g. if you visit Google in the newer Chromes)

brion added a comment.Via ConduitJan 25 2012, 6:10 PM

I'm seeing surprisingly little user-facing documentation on this. :P

Last section of https://code.google.com/p/mod-spdy/wiki/GettingStarted seems to indicate that users should use https:// URLs at least for mod_spdy, which will serve out SPDY if clients support it and HTTPS otherwise.

What info I can find indicates that SPDY inherently runs over SSL, so I assume that http:// URLs would not 'auto-upgrade' the protocol while https:// would...?

bzimport added a comment.Via ConduitApr 17 2012, 9:11 PM

lcarr wrote:

They just released an apache mod for this... http://googledevelopers.blogspot.com/2012/04/add-spdy-support-to-your-apache-server.html

Might be worth testing out in labs, however it looks like this won't affect most of our infrastructure.

Platonides added a comment.Via ConduitApr 17 2012, 9:33 PM

Well, I think it could (negatively) affect the squid caches...

hashar added a comment.Via ConduitMar 18 2013, 10:11 AM

Moving down to very lowest priority.

McZusatz added a comment.Via ConduitJul 3 2013, 10:11 PM

(In reply to comment #4)

> Might be worth testing out in labs, however it looks like this won't affect most of our infrastructure.

Any updates?

GWicke added a comment.Via ConduitOct 4 2013, 6:14 PM
  • Bug 54986 has been marked as a duplicate of this bug.
yuvipanda added a comment.Via ConduitOct 4 2013, 6:15 PM

I'm going to add support for this into the dynamicproxy on labs, and see how that goes.

GWicke added a comment.Via ConduitOct 4 2013, 6:15 PM

From bug 33890:

SPDY is supported in Nginx and most modern browsers. Since we are using Nginx
for HTTPS termination already, we should consider enabling SPDY support in it.
This cuts down the overhead per request, which in turn makes it feasible to
de-bundle in particular API requests for less cache fragmentation.

GWicke added a comment.Via ConduitOct 4 2013, 9:10 PM

Upping priority as this is something we can do with Nginx without affecting Varnish or anything else in the backend.

yuvipanda added a comment.Via ConduitOct 5 2013, 5:52 PM

https://gerrit.wikimedia.org/r/#/c/87682/ enables this for the labs dynamicproxy.

yuvipanda added a comment.Via ConduitOct 7 2013, 12:24 AM

That patch has been merged, and dynamicproxy has spdy enabled! \o/

http://spdycheck.org/#pinklake.wmflabs.org

This provides support for spdy/2, which is what nginx supports so far. It is a newer build of nginx packaged for this project (by andrewbogott), and I guess it can eventually be used in production too.
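
For reference, offering SPDY in nginx (when built with SPDY support) is a one-directive change on the TLS listener. A minimal sketch — the server name and certificate paths below are illustrative, not the actual dynamicproxy configuration from the merged patch:

```nginx
server {
    # Adding "spdy" to the listen directive is all that is needed to
    # offer SPDY alongside plain HTTPS on the same port; clients that
    # don't negotiate SPDY simply fall back to HTTP over TLS.
    listen 443 ssl spdy;
    server_name example.wmflabs.org;              # illustrative vhost

    ssl_certificate     /etc/ssl/certs/example.pem;    # illustrative
    ssl_certificate_key /etc/ssl/private/example.key;  # illustrative
}
```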

GWicke added a comment.Via ConduitJan 17 2014, 9:18 PM

SPDY is an optimization that a lot of our clients can profit from now: http://caniuse.com/spdy

Especially clients on high-latency and low-bandwidth links (mobile, developing countries) benefit from header compression and avoidance of TCP slow-start on concurrent connections.

Are there specific issues that prevent us from applying the simple config change in [1] for a subset of the production traffic for testing?

[1]: https://gerrit.wikimedia.org/r/#/c/87682/2/modules/dynamicproxy/templates/proxy.conf

ori added a comment.Via ConduitJan 17 2014, 9:22 PM

(In reply to comment #14)

> Are there specific issues that prevent us from applying the simple config change in [1] for a subset of the production traffic for testing?
>
> [1]: https://gerrit.wikimedia.org/r/#/c/87682/2/modules/dynamicproxy/templates/proxy.conf

Yeah -- the packaged build of Nginx that we are running in production was not compiled with the requisite flag for SPDY support. But your broader point is correct: it is pretty simple to do, and there is no compelling reason not to do it, AFAIK.

RyanLane added a comment.Via ConduitJan 17 2014, 9:26 PM

(In reply to comment #15)

(In reply to comment #14)
> Are there specific issues that prevent us from applying the simple config change in [1] for a subset of the production traffic for testing?
>
> [1]: https://gerrit.wikimedia.org/r/#/c/87682/2/modules/dynamicproxy/templates/proxy.conf

Yeah -- the packaged build of Nginx that we are running in production was not compiled with the requisite flag for SPDY support. But your broader point is correct: it is pretty simple to do, and there is no compelling reason not to do it, AFAIK.

Except that our infrastructure isn't really set up in an ideal way to use SPDY. We use multiple endpoints, for instance. It also complicates our SSL plans. Can you use a shared SSL cache? Does it properly support forward secrecy (can we roll the keys? we can't properly with nginx). Etc. etc. etc.

It's not as simple as you're making it out to be.

Catrope added a comment.Via ConduitJan 17 2014, 9:26 PM

(In reply to comment #15)

> Yeah -- the packaged build of Nginx that we are running in production was not compiled with the requisite flag for SPDY support. But your broader point is correct: it is pretty simple to do, and there is no compelling reason not to do it, AFAIK.

IIRC SPDY's header compression is vulnerable to the CRIME/BREACH family of attacks. HTTP 2.0 is going to use a different header compression technique for that reason, but implementations of HTTP 2.0 aren't really done yet AFAIK. If we deploy SPDY, we should turn off header compression.

GWicke added a comment.Via ConduitJan 17 2014, 10:10 PM

@ori, ah, I see. I guess a backport or the next Ubuntu LTS upgrade (three months from now?) could help here.

@faidon: Is the main complication becoming tied to nginx? With SPDY being an application-level protocol I would be surprised if it affected TLS layer issues.

@Roan: According to [1] "The nginx web-server was not vulnerable to CRIME since 1.0.9/1.1.6 (October/November 2011) using OpenSSL 1.0.0+, and since 1.2.2/1.3.2 (June / July 2012) using all versions of OpenSSL". Disabling header compression sounds like a prudent measure though. Even without header compression SPDY saves bandwidth by avoiding re-sending identical headers for each request.

According to [2] BREACH is not specific to SPDY; it rather applies to all uses of TLS.

[1]: https://en.wikipedia.org/wiki/CRIME_(security_exploit)#Mitigation
[2]: https://en.wikipedia.org/wiki/Transport_Layer_Security#CRIME_and_BREACH_attacks

GWicke added a comment.Via ConduitJan 17 2014, 11:11 PM

@Ryan: My response directed to Faidon was really for you, in case that wasn't clear from the context.

Catrope added a comment.Via ConduitJan 17 2014, 11:22 PM

(In reply to comment #18)

> @Roan: According to [1] "The nginx web-server was not vulnerable to CRIME since 1.0.9/1.1.6 (October/November 2011) using OpenSSL 1.0.0+, and since 1.2.2/1.3.2 (June / July 2012) using all versions of OpenSSL". Disabling header compression sounds like a prudent measure though. Even without header compression SPDY saves bandwidth by avoiding re-sending identical headers for each request.

Which I believe is what HTTP 2.0 does too: only send modified headers, but don't compress their contents.

GWicke added a comment.Via ConduitApr 1 2014, 10:07 PM

So I guess this is waiting for a more up to date version of nginx at this point.

(In reply to Ryan Lane from comment #16)

> Yeah -- the packaged build of Nginx that we are running in production was not compiled with the requisite flag for SPDY support. But your broader point is correct: it is pretty simple to do, and there is no compelling reason not to do it, AFAIK.

> Except that our infrastructure isn't really set up in an ideal way to use SPDY. We use multiple endpoints, for instance.

Do you mean us using several domains? Even just using it for the API and bits would already be a good step forward. We can optimize further down the road.

> Can you use a shared SSL cache?

Can you describe what this is about?

> Does it properly support forward secrecy (can we roll the keys? we can't properly with nginx). Etc. etc. etc.

As far as I can tell nginx supports forward secrecy. Can you describe the issue that you see with key rolling?

RyanLane added a comment.Via ConduitApr 1 2014, 10:29 PM

(In reply to Gabriel Wicke from comment #21)

> So I guess this is waiting for a more up to date version of nginx at this point.

(In reply to Ryan Lane from comment #16)
> > Yeah -- the packaged build of Nginx that we are running in production was not compiled with the requisite flag for SPDY support. But your broader point is correct: it is pretty simple to do, and there is no compelling reason not to do it, AFAIK.
>
> Except that our infrastructure isn't really set up in an ideal way to use SPDY. We use multiple endpoints, for instance.

> Do you mean us using several domains? Even just using it for the API and bits would already be a good step forward. We can optimize further down the road.

> > Can you use a shared SSL cache?

> Can you describe what this is about?

Sure. Currently we use source hash in LVS to ensure a client always hits the same frontend SSL server, so that they always reuse the SSL session. This works OK, but often leads to bugs. For instance, the monitoring servers don't always detect errors, because they are always hitting the same nodes; subsets of users see a problem while others, including us, don't; etc. Also, it's not possible to weight the load of different servers while using source hash, so we'd really like to switch to weighted round robin.

Part of that is supporting an SSL session cache that spans the SSL nodes. Apache has support for this. Stud has support for this. Nginx does not, so we were considering switching away or adding it. Adding SPDY into this may change things. Does it use the same cache as HTTPS? If we switch to something other than Nginx, will it support SPDY?

Either way, this is actually a more important problem to solve than SPDY, currently.

> Does it properly support forward secrecy (can we roll the keys? we can't properly with nginx). Etc. etc. etc.

As far as I can tell nginx supports forward secrecy. Can you describe the
issue that you see with key rolling?

It's actually handled by openssl on initialization. So, until you restart nginx the key isn't actually rotated. This is also a concern with using weighted round robin load balancing, since we'd need to ensure the rotated key is distributed across all the nodes as well. Apache keeps the rotated key on the filesystem (and recommends using shared memory for this), so this is less of a problem with Apache, but we have no way of handling this with nginx, except for restarting the servers, which sucks.

Anyway, we can likely ignore forward secrecy for now since it's basically worthless considering we're very vulnerable to traffic analysis.
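
For context on the cache question above: nginx's built-in `ssl_session_cache` is shared memory among one machine's worker processes only, not across separate SSL termination nodes, which is exactly the gap being described. A minimal per-host sketch (cache name, size, and timeout are illustrative values, not a proposed production config):

```nginx
http {
    # Shared among this machine's nginx worker processes only -- there
    # is no built-in way to span the cache across multiple termination
    # nodes, which is why round-robin balancing causes cache misses.
    ssl_session_cache   shared:SSL:10m;   # ~10 MB, illustrative size
    ssl_session_timeout 10m;              # illustrative lifetime
}
```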

GWicke added a comment.Via ConduitApr 2 2014, 10:27 PM

(In reply to Ryan Lane from comment #22)

(In reply to Gabriel Wicke from comment #21)
> > Can you use a shared SSL cache?
>
> Can you describe what this is about?
>

> Sure. Currently we use source hash in LVS to ensure a client always hits the same frontend SSL server to ensure they always reuse the SSL session. This works ok, but often leads to bugs. For instance, the monitoring servers don't always detect errors, because they are always hitting the same nodes, subsets of users see a problem while others, including us, don't, etc. Also, it's not possible to weight the load of different servers while using source hash, so we'd really like to switch to weighted round robin.
>
> Part of that is supporting an SSL session cache that spans the SSL nodes. Apache has support for this. Stud has support for this. Nginx does not, so we were considering switching away or adding it. Adding SPDY into this may change things. Does it use the same cache as HTTPS? If we switch to something other than Nginx, will it support SPDY?

SPDY sets up a single TCP connection per host (really IP and cert) and then multiplexes all requests over that connection. My understanding of the way we use LVS is that all incoming traffic for a given TCP connection is forwarded to the same backend server even in round robin mode. The single SPDY connection would end up talking to the same backend all the time. So moving towards SPDY might actually reduce the need for a shared SSL cache for most connections.

RyanLane added a comment.Via ConduitApr 2 2014, 10:40 PM

(In reply to Gabriel Wicke from comment #23)

(In reply to Ryan Lane from comment #22)
> (In reply to Gabriel Wicke from comment #21)
> > > Can you use a shared SSL cache?
> >
> > Can you describe what this is about?
> >
>
> Sure. Currently we use source hash in LVS to ensure a client always hits the same frontend SSL server to ensure they always reuse the SSL session. This works ok, but often leads to bugs. For instance, the monitoring servers don't always detect errors, because they are always hitting the same nodes, subsets of users see a problem while others, including us, don't, etc. Also, it's not possible to weight the load of different servers while using source hash, so we'd really like to switch to weighted round robin.
>
> Part of that is supporting an SSL session cache that spans the SSL nodes. Apache has support for this. Stud has support for this. Nginx does not, so we were considering switching away or adding it. Adding SPDY into this may change things. Does it use the same cache as HTTPS? If we switch to something other than Nginx, will it support SPDY?

> SPDY sets up a single TCP connection per host (really IP and cert) and then multiplexes all requests over that connection. My understanding of the way we use LVS is that all incoming traffic for a given TCP connection is forwarded to the same backend server even in round robin mode. The single SPDY connection would end up talking to the same backend all the time. So moving towards SPDY might actually reduce the need for a shared SSL cache for most connections.

If the connection is broken and a new connection is needed, it'll likely hit another server when using round robin, which means an SSL cache miss. This is pretty common with mobile clients, which is where it matters most.

GWicke added a comment.Via ConduitApr 3 2014, 12:06 AM

(In reply to Ryan Lane from comment #24)

> If the connection is broken and a new connection is needed, it'll likely hit another server when using round robin, which means an SSL cache miss. This is pretty common with mobile clients, which is where it matters most.

The difference is that it happens less often with SPDY, as a single connection is going to remain busy & kept alive for longer, and you only pay the setup cost for one connection rather than 6 or so otherwise.

How long is the SSL cache normally kept around / valid?

faidon added a comment.Via ConduitApr 3 2014, 8:10 AM

Let's take a step back: SSL's scaling & performance requirements and SPDY are not things that can be discussed effectively in a BZ bug, I think. There's a lot of work involved, some of which is documented under https://wikitech.wikimedia.org/wiki/HTTPS/Future_work and some of which is not (SPDY).

There is most likely going to be a quarterly SSL/SPDY goal with multiple people involved, as it spans multiple layers, involves some low-level C coding, has cross-team dependencies, etc. It's possible it may even span more than a quarter — there is a lot of work needed to have a properly functioning, scalable infrastructure.

I think it's unlikely it's going to be in this coming quarter's goals, but the priorities have not been set yet so nothing's definite — Gabriel, Ori, Roan and others you're very much welcome to provide input to this process as it relates to your team's goals (SOA, performance etc.) as it would certainly help us prioritize it more effectively.

Such a project will result in multiple bug reports/RT issues, and leaving this open as a placeholder and master ticket is fine IMHO. I just don't think we can effectively have such a large discussion here.

GWicke added a comment.Via ConduitApr 18 2014, 4:04 PM

@faidon: I agree that a wider discussion is needed to come to a conclusion & make a plan / agree on priorities. Let's use this bug to collect more information for now to inform that discussion.

Nginx lets you specify keepalive timeouts separately for HTTPS vs. SPDY connections; see keepalive_timeout and spdy_keepalive_timeout. With only a single connection used for SPDY, the keepalive can be set significantly higher than the 65s default for HTTPS without resulting in an excessive number of connections. Combined with around 65% of requests already supporting SPDY [1], this might reduce the need for SSL session caching somewhat.

Also potentially relevant is http://tools.ietf.org/html/rfc5077, with an implementation discussed in http://vincent.bernat.im/en/blog/2011-ssl-session-reuse-rfc5077.html#sharing-tickets. Sadly Safari and old IE versions don't support it, with Safari being the main non-SPDY hold-out. According to https://www.ssllabs.com/ssltest/viewClient.html?name=IE&version=11&platform=Win%208.1 IE 11 does support session tickets.

[1]: http://caniuse.com/spdy
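
Putting those two ideas together — longer SPDY keepalives plus RFC 5077 session tickets — a minimal nginx sketch might look like the following. All values and paths are illustrative, not a proposed production config:

```nginx
# Illustrative timeouts. With one multiplexed SPDY connection per
# client, a much longer keepalive costs far fewer open sockets than
# the same value would under plain HTTPS with ~6 parallel connections.
keepalive_timeout       65s;
spdy_keepalive_timeout  300s;

# RFC 5077 session tickets let a returning client resume TLS without
# server-side cache state; deploying the same ticket key file on all
# termination nodes makes resumption work under round-robin balancing.
ssl_session_tickets     on;
ssl_session_ticket_key  /etc/nginx/ssl/ticket.key;  # illustrative path
```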

GWicke added a comment.Via ConduitApr 18 2014, 6:08 PM

Another bit of info re keep-alives from https://groups.google.com/forum/#!topic/spdy-dev/xgyPztsAKls:

FF keeps SPDY connections alive for 3 minutes using PING frames. Servers can also keep connections alive using PING frames. Not sure if any implementations do that currently. On mobile pings do have a battery cost.

GWicke added a comment.Via ConduitMay 2 2014, 6:59 PM

nginx 1.6 now has SPDY 3.1 support.

GWicke added a comment.Via ConduitJun 5 2014, 8:51 PM

Good news: The last browser hold-out (Safari) will finally get SPDY support as well:

"Safari supports the latest web standards, including WebGL and SPDY"
http://www.apple.com/pr/library/2014/06/02Apple-Announces-OS-X-Yosemite.html

This should soon increase browser support beyond the 67% currently claimed on http://caniuse.com/spdy.

faidon added a comment.Via ConduitJun 6 2014, 2:01 PM

That's good news indeed.

Of course that 67% figure is bogus, as it assumes SPDY is one protocol, while there are four versions available and browsers & servers each support a different combination of these. So, for example, nginx 1.5.10+ advertises only SPDY 3.1, which is only supported by Firefox 27+, Chrome 28+ etc. Similarly, SPDY/2 support was removed from Firefox 28/Chrome 33, so it's not like a server can stick to a previous version.

This is partially alleviated by automatic browser updates, but it's hardly the same as having that table be predominantly "green" and saying "67% of the browsers support it" :)

MZMcBride added a comment.Via ConduitJun 17 2014, 5:01 AM

(In reply to Faidon Liambotis from comment #26)

> There's a lot of work involved, some of which is documented under https://wikitech.wikimedia.org/wiki/HTTPS/Future_work and some of which is not (SPDY).
>
> There is most likely going to be a quarterly SSL/SPDY goal with multiple people involved, as it spans multiple layers, involves some low-level C coding, has cross-team dependencies, etc. It's possible it may even span more than a quarter — there is a lot of work needed to have a properly functioning, scalable infrastructure.

What work is needed? I re-read this ticket (and looked for dependency tickets) and re-skimmed [[wikitech:HTTPS/Future work]], but I'm still unclear on what work is needed to support SPDY on Wikimedia wikis.

Nemo_bis added a comment.Via ConduitJun 17 2014, 7:15 AM

https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog#SPDY_.2F_HTTP_2.0 is empty as well and all https://www.mediawiki.org/?oldid=975972 says is "Would help with asset delivery, front-end perf".

JanZerebecki added a comment.Via ConduitJun 17 2014, 9:31 PM

I don't think there is anything directly blocking SPDY except upgrading nginx (or switching to something else) and testing it. Both are quite a bit of work in themselves. After that it will be useful to analyse whether enabling it actually helps.

However, I also think that it doesn't make sense to upgrade nginx only to then decide to switch to Apache because one of the other items from [[wikitech:HTTPS/Future work]] requires it. So one implementation needs to be selected and the necessary missing features (like e.g. a distributed session cache for nginx) need to be coded. (Not necessarily in that order. There are a few more variants on the wiki page I omitted for brevity; basically, more informed decisions need to be made. All interesting stuff, wish I could spend more time on it.)

faidon changed the title from "support SPDY protocol" to "Support SPDY".Via WebJan 13 2015, 3:04 PM
faidon edited the task description. (Show Details)
faidon added a project: ops-core.
faidon set Security to None.
konklone added a subscriber: konklone.Via WebJan 15 2015, 4:38 AM

This seems like a smart thing to prioritize for the HTTPS-by-default tag, since it has such drastic front-end speed improvements for multiplexing resources. I've never managed an infrastructure like Wikipedia's, but the SPDY module for nginx has shipped for a while and is very easy to turn on.

faidon added a comment.Via WebJan 15 2015, 6:25 AM

We've made a conscious decision to prioritize our HTTPS scalability work and turn on SPDY (or rather, HTTP/2.0) very shortly after. You could argue it's part of the same series of steps or a separate step, but in the end, it doesn't really matter; what does matter, is that it's happening right after this work.

Chmarkine added a subscriber: Chmarkine.Via WebJan 20 2015, 8:36 PM
GWicke added a comment.Via WebFeb 9 2015, 9:21 PM

The new jessie nginx test install already supports SPDY, and I believe it is serving a fraction of the prod traffic: https://spdycheck.org/#cp1008.wikimedia.org

So it looks like we'll gradually get wider SPDY support as the Jessie Nginx installs are being rolled out.

Tony_Tan_98 added a subscriber: Tony_Tan_98.Via WebMar 8 2015, 10:16 AM
faidon closed this task as "Resolved".Via WebMar 14 2015, 12:32 PM
faidon claimed this task.
faidon added a subscriber: BBlack.

@BBlack tackled this while implementing T86648. All HTTP frontends are now running a newer stack and have SPDY enabled. There are a number of subsequent performance enhancements that we can implement because of this (e.g. -somehow- undo our domain sharding, or move to the same service IP + certificate), but we'll track these separately.

Nemo_bis added projects: notice, user-notice.Via WebMar 20 2015, 4:56 PM
gpaumier moved this task to Announce in next Tech/News on the user-notice workboard.Via WebMar 23 2015, 2:50 PM
gpaumier moved this task to Triaged on the notice workboard.Via WebMar 25 2015, 10:00 PM
gpaumier edited the task description. (Show Details)Via WebMar 26 2015, 11:28 PM
gpaumier added a subscriber: gpaumier.

(Added a link to the English Wikipedia article for people who come here from Tech News and don't know what SPDY is.)

gpaumier moved this task to In current Tech News draft on the user-notice workboard.Via WebMar 26 2015, 11:32 PM
Ricordisamoa added a subscriber: Ricordisamoa.Via WebMar 30 2015, 4:46 PM
gpaumier moved this task to Recently announced in Tech/News on the user-notice workboard.Via WebMar 30 2015, 8:15 PM
gpaumier moved this task to Archive on the user-notice workboard.Via WebApr 3 2015, 8:49 PM
konklone removed a subscriber: konklone.Via WebApr 3 2015, 10:36 PM
gpaumier moved this task to Archive on the notice workboard.Via WebApr 9 2015, 5:45 PM

Add Comment