DNS recursors TCP retransmits
Closed, Declined (Public)

Description

Thanks to https://gerrit.wikimedia.org/r/c/operations/puppet/+/476393 we now have the DNS recursors in the TCP retransmits panel:

https://grafana.wikimedia.org/dashboard/db/network-performances-global?panelId=18&fullscreen&edit&tab=alert&orgId=1&from=now-30m&to=now

They all show a retransmit rate of ~10%, which is pretty high.

Digging a bit more (on dns2001), the following pattern seems to repeat for every single TCP handshake on port 53:

No.     Rel time       Source Mac            Source                Destination           Src port Dest Mac              Dst port Protocol Info
      9 0.000084       40:a8:f0:2c:66:e8     208.80.153.69         208.80.153.77         36196    d0:94:66:5f:6a:40     53       TCP      36196 → 53 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=2560829592 TSecr=0 WS=512
     10 0.000028       d0:94:66:5f:6a:40     208.80.153.77         208.80.153.69         53       40:a8:f0:2c:66:e8     36196    TCP      53 → 36196 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1759664423 TSecr=2560829592 WS=512
     11 0.000067       40:a8:f0:2c:66:e8     208.80.153.69         208.80.153.77         36196    d0:94:66:5f:6a:40     53       TCP      36196 → 53 [ACK] Seq=1 Ack=1 Win=29696 Len=0 TSval=2560829592 TSecr=1759664423
     13 0.989525       d0:94:66:5f:6a:40     208.80.153.77         208.80.153.69         53       40:a8:f0:2c:66:e8     36196    TCP      [TCP Retransmission] 53 → 36196 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1759664680 TSecr=2560829592 WS=512
     14 0.000070       40:a8:f0:2c:66:e8     208.80.153.69         208.80.153.77         36196    d0:94:66:5f:6a:40     53       TCP      [TCP Dup ACK 11#1] 36196 → 53 [ACK] Seq=1 Ack=1 Win=29696 Len=0 TSval=2560829848 TSecr=1759664423
     53 0.345973       d0:94:66:5f:6a:40     208.80.153.77         208.80.153.69         53       40:a8:f0:2c:66:e8     36196    TCP      53 → 36196 [FIN, ACK] Seq=1 Ack=1 Win=29184 Len=0 TSval=1759665222 TSecr=2560829848
     54 0.000232       40:a8:f0:2c:66:e8     208.80.153.69         208.80.153.77         36196    d0:94:66:5f:6a:40     53       TCP      36196 → 53 [FIN, ACK] Seq=1 Ack=2 Win=29696 Len=0 TSval=2560830391 TSecr=1759665222
     55 0.000035       d0:94:66:5f:6a:40     208.80.153.77         208.80.153.69         53       40:a8:f0:2c:66:e8     36196    TCP      53 → 36196 [ACK] Seq=2 Ack=2 Win=29184 Len=0 TSval=1759665222 TSecr=2560830391

In the capture above it seems like:

  1. The LVS (208.80.153.69) starts the handshake (SYN) - No 9
  2. dns2001 replies with a SYN-ACK - No 10
  3. The LVS sends the expected final ACK. The server receives it (the capture is taken on the dns2001 side) but never registers it - No 11
  4. So ~1s later, dns2001 retransmits the SYN-ACK - No 13
  5. The LVS gets the second SYN-ACK and replies with a (dup) ACK - No 14
  6. About 1/3s later, I believe because no new packets have arrived, dns2001 ACKs the dup ACK while asking the LVS to close the session (FIN, ACK with the same ack number as the dup ACK) - No 53
  7. The TCP session is closed properly - No 54/55
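
To check whether other flows in a capture show the same behaviour as steps 3 and 4, something like the sketch below could flag every SYN-ACK that appears more than once with the same seq/ack numbers. The packet tuples are hand-copied from frames 9 to 14 above; the tuple layout and the function name are illustrative assumptions, not the output of any real capture tool.

```python
from collections import defaultdict

# Hand-copied from frames 9-14 of the capture above.
# Hypothetical layout: (frame_no, src_ip, src_port, dst_ip, dst_port, flags, seq, ack)
PACKETS = [
    (9,  "208.80.153.69", 36196, "208.80.153.77", 53,    "SYN",     0, 0),
    (10, "208.80.153.77", 53,    "208.80.153.69", 36196, "SYN,ACK", 0, 1),
    (11, "208.80.153.69", 36196, "208.80.153.77", 53,    "ACK",     1, 1),
    (13, "208.80.153.77", 53,    "208.80.153.69", 36196, "SYN,ACK", 0, 1),
    (14, "208.80.153.69", 36196, "208.80.153.77", 53,    "ACK",     1, 1),
]


def find_synack_retransmits(packets):
    """Return {(src, sport, dst, dport): [frame numbers]} of repeated SYN-ACKs."""
    seen = defaultdict(list)
    for frame, src, sport, dst, dport, flags, seq, ack in packets:
        if flags == "SYN,ACK":
            # Same endpoints and same seq/ack means the same SYN-ACK sent again.
            seen[(src, sport, dst, dport, seq, ack)].append(frame)
    # Drop the first occurrence; everything after it is a retransmission.
    return {key[:4]: frames[1:] for key, frames in seen.items() if len(frames) > 1}


print(find_synack_retransmits(PACKETS))
# {('208.80.153.77', 53, '208.80.153.69', 36196): [13]}
```

Frame 13 is reported even though the client's ACK (frame 11) sits between the two SYN-ACKs, which is the anomaly in question.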

Note that a 15s capture doesn't show any real DNS traffic on port 53, only TCP handshakes, most likely health checks.
This doesn't seem to impact the healthchecks (no alarms).
If there were real TCP DNS traffic, this could add a delay of X seconds, where X >= 1, depending on how often the retransmits happen.

The question is: why doesn't the server register the original ACK (the final step of the three-way handshake)?

Event Timeline

ayounsi triaged this task as Medium priority. Dec 4 2018, 5:43 PM
ayounsi created this task.
Restricted Application added a subscriber: Aklapper.
ayounsi lowered the priority of this task from Medium to Low. Dec 4 2018, 6:22 PM
ayounsi added a project: PyBal.

A DNS query over TCP from bast2001, either to dns2001 directly or to dns2002 via the LVS VIP (dig @dns-rec-lb.codfw.wikimedia.org en.wikipedia.org +tcp), doesn't show any retransmits. So the issue seems to be limited to pybal healthchecks.
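
For anyone wanting to reproduce the two cases side by side, here is a rough sketch of the difference. It assumes the health check amounts to a connect-and-close with no payload, which is what the capture suggests (Len=0 on every frame) but which I have not confirmed against the pybal code. Both function names are made up for illustration.

```python
import socket
import struct


def build_dns_tcp_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query framed for TCP (2-byte length prefix + message)."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE (1 = A), QCLASS IN
    message = header + question
    return struct.pack("!H", len(message)) + message


def blank_connect(host, port=53, timeout=3.0):
    """Open a TCP connection and close it without sending any data.

    Assumption: this mimics the health check seen in the capture.
    """
    with socket.create_connection((host, port), timeout=timeout):
        pass


query = build_dns_tcp_query("en.wikipedia.org")
print(len(query))  # 36: 2-byte prefix + 12-byte header + 22-byte question
```

Running `blank_connect("dns2001...")` while capturing on the server, and then sending `query` over a fresh connection, should show whether only the data-less connections trigger the SYN-ACK retransmit.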

These are still present AFAIK, and we're fairly certain it's just due to pybal healthchecks using blank/broken TCP connections to monitor them. That will be cleaned up in T239993 when we get rid of LVS-based recdns.