
Extra RTT on TLS handshakes
Closed, ResolvedPublic

Description

Lately there's an extra RTT on our initial TLS handshakes on the cache terminators. This is likely the result of a change to OpenSSL (1.0 -> 1.1) and/or nginx, and/or our configuration. It was originally reported by @GWicke as a regression in WebPageTest results comparing production against the labs proxy and against production in the past. What I know from my own testing so far:

  1. I can reproduce it both with Chrome and curl on Linux, but the easiest thing to test with is `openssl s_client`.
  2. It's sensitive to the data size of the server's handshake response (which is largely determined by certificate chain and OCSP staple sizes), and the critical value is somewhere in the 4K-ish ballpark as reported by `openssl s_client`'s "SSL handshake has read N bytes".
  3. You can manipulate the handshake size with s_client (for testing purposes) by changing the client's `-cipher` (ECDHE vs DHE vs non-FS, ECDSA vs RSA for the auth alg, etc) and by asking for stapling with `-status` (or not).
  4. Adding 1000ms of artificial network delay on your local machine makes it far easier to test for the extra RTT, as an extra second doesn't get lost in various other noise. I've been adding it to my wifi with: `tc qdisc add dev wlp2s0 root netem delay 1000ms` (and then s/add/del/ to revert it).
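For concreteness, a single test iteration along the lines above looks roughly like this (the interface name and target host are placeholders; the `tc` commands need root):

```shell
# Add 1000ms of artificial delay (interface name is a placeholder)
tc qdisc add dev wlp2s0 root netem delay 1000ms

# Handshake with stapling requested; pull out the reported handshake size
echo | openssl s_client -connect cp1008.wikimedia.org:443 \
    -cipher ECDHE-ECDSA-AES128-GCM-SHA256 -status 2>/dev/null |
    sed -n 's/^SSL handshake has read \([0-9]*\) bytes.*/\1/p'

# Revert the artificial delay
tc qdisc del dev wlp2s0 root netem delay 1000ms
```

With the 1000ms delay in place, the extra RTT shows up as an obvious extra second of wall-clock time on the s_client run.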

These are some test results from my first round of testing, against a "normal" software stack and config on cp1008:

| Server Bytes | Extra RTT | Cipher | Stapling |
|---|---|---|---|
| 3238 | No | ECDHE-ECDSA-AES128-GCM-SHA256 | No |
| 3270 | No | ECDHE-ECDSA-AES128-SHA | No |
| 3344 | No | AES128-SHA | No |
| 3625 | No | ECDHE-RSA-AES128-GCM-SHA256 | No |
| 3657 | No | ECDHE-RSA-AES128-SHA | No |
| 4132 | No | DHE-RSA-AES128-SHA | No |
| 4867 | Yes | ECDHE-ECDSA-AES128-GCM-SHA256 | Yes |
| 4899 | Yes | ECDHE-ECDSA-AES128-SHA | Yes |
| 4974 | Yes | AES128-SHA | Yes |
| 5255 | Yes | ECDHE-RSA-AES128-GCM-SHA256 | Yes |
| 5287 | Yes | ECDHE-RSA-AES128-SHA | Yes |
| 5762 | Yes | DHE-RSA-AES128-SHA | Yes |

Basically, the extra RTT always appears when stapling is added, but there's also a 735-byte gap in the sizes I can test there (stapling is relatively large!), so it could also be that the critical size just happens to land in the window between 4132 and 4867 (again, as reported by s_client). The extra-RTT boundary falling in that range also sounds an awful lot like a lack of IW10 (the older 3-segment initial window would cap the first flight at 4380 bytes).
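As a back-of-the-envelope sanity check on that 4380 figure (assuming the typical 1460-byte MSS on a 1500-byte MTU path): a pre-IW10 initial window of 3 segments caps the first flight right at the observed boundary, while IW10 would clear even our largest handshakes.

```shell
mss=1460  # typical MSS on a 1500-byte MTU path
echo "IW3 first flight:  $((3 * mss)) bytes"    # 4380 - right at the observed cutoff
echo "IW10 first flight: $((10 * mss)) bytes"   # 14600 - well above our ~5.8K worst case
```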

In any case, the first thing to check is whether we can artificially produce a smaller stapled response, to prove that it isn't the stapling feature itself (regardless of size) that triggers the issue. So I re-tested the smallest result above, but with the intermediate certificate omitted to reduce the size:

| 3737 | No | ECDHE-ECDSA-AES128-GCM-SHA256 | Yes |

(intermediate cert not sent, to prove we can staple without extra RTT at all)

The problem looks a lot like https://trac.nginx.org/nginx/ticket/413 , which was fixed years ago. It's possible the fix doesn't work with OpenSSL-1.1 (it does look pretty hacky). After going down a bunch of other pointless avenues, I recompiled libssl itself to change the default BIO buffer size from 4K to 8K, and that fixed the extra RTT in all cases. I don't think that's the correct or ideal solution here, but it may be what we have to do for now just to get our handshake times back down.
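For reference, the workaround amounts to a one-line change at libssl build time, along these lines (assuming the relevant define is the buffering BIO's default in `crypto/bio/bf_buff.c`; the exact file and macro name should be verified against the 1.1.0 tree):

```diff
--- a/crypto/bio/bf_buff.c
+++ b/crypto/bio/bf_buff.c
@@
-#define DEFAULT_BUFFER_SIZE	4096
+#define DEFAULT_BUFFER_SIZE	8192
```

With an 8K buffer, the full handshake flight (cert chain plus staple) fits in a single buffered write, so no partial flush forces the extra round trip.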

Event Timeline

BBlack created this task.Nov 11 2016, 11:46 PM
Restricted Application added a project: Operations.Nov 11 2016, 11:46 PM
Restricted Application added a subscriber: Aklapper.
BBlack updated the task description. (Show Details)Nov 11 2016, 11:52 PM

Just to double-check things, I've also confirmed that by adding extra copies of the intermediate cert, I can induce the extra RTT without using stapling at all. During these tests, the range of the unknown cutoff value was further constrained to 4158 - 4365.
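That padding trick is easy to reproduce, since nginx sends the configured chain file verbatim: each extra copy of the intermediate certificate inflates the server's first flight by roughly the intermediate's size. A sketch with dummy placeholder files standing in for the real PEMs:

```shell
# Dummy placeholder PEMs; substitute the real leaf and intermediate certs.
printf 'LEAF-CERT\n' > leaf.pem
printf 'INTERMEDIATE-CERT\n' > intermediate.pem

# Normal chain vs. a chain padded with two extra intermediate copies:
cat leaf.pem intermediate.pem > chain-normal.pem
cat leaf.pem intermediate.pem intermediate.pem intermediate.pem > chain-padded.pem

# Each extra copy adds one intermediate's worth of bytes, letting you sweep
# the handshake size across the suspected ~4K cutoff.
wc -c chain-normal.pem chain-padded.pem
```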

ema moved this task from Triage to TLS on the Traffic board.Nov 14 2016, 8:06 AM

Mentioned in SAL (#wikimedia-operations) [2016-11-14T15:13:13Z] <bblack> uploaded libssl1.1 1.1.0c-1+wmf2 to jessie-wikimedia/backports - T150561

Mentioned in SAL (#wikimedia-operations) [2016-11-14T15:31:27Z] <bblack> upgrade libssl1.1 package to 1.1.0c-1+wmf2 on cache clusters - T150561

Mentioned in SAL (#wikimedia-operations) [2016-11-14T15:33:17Z] <bblack> cache_misc - seamless nginx restart for libssl1.1 upgrade - T150561

Mentioned in SAL (#wikimedia-operations) [2016-11-14T15:35:05Z] <bblack> cache_maps - seamless nginx restart for libssl1.1 upgrade - T150561

Mentioned in SAL (#wikimedia-operations) [2016-11-14T16:07:56Z] <bblack> cache_text - seamless nginx restart for libssl1.1 upgrade - T150561

Mentioned in SAL (#wikimedia-operations) [2016-11-14T16:19:05Z] <bblack> cache_upload - seamless nginx restart for libssl1.1 upgrade - T150561

Above SAL entries deployed the workaround (setting default libssl buffer size to 8K at compile-time), which solves the immediate issue and restores normal handshake performance.

Leaving this open for now until we resolve how to fix this better for the future - see openssl-users thread here: https://mta.openssl.org/pipermail/openssl-users/2016-November/004835.html

My latest round of benchmarks confirms the fix as well. @BBlack, thank you for investigating & addressing this regression so quickly!

BBlack updated the task description. (Show Details)Nov 14 2016, 6:05 PM
BBlack closed this task as Resolved.Jan 5 2017, 5:52 PM
BBlack claimed this task.

For future reference: nginx now (since 1.13.1) works around this issue by setting TCP_NODELAY before doing the handshake: https://trac.nginx.org/nginx/ticket/413#comment:8

OpenSSL removed access to handshake buffers in commit https://github.com/openssl/openssl/commit/2e7dc7cd688