
Broken transit/filtering for wmcs floating IP range 185.15.56.240/29
Closed, Resolved, Public

Description

I don't know of a reason why 185.15.56.245 is different from other floating IPs in WMCS, but much of the internet can't reach it. Another floating IP in the same tenant (tools) works fine: 185.15.56.60

Event Timeline

We are looking at https://tools.keycdn.com/ping which shows that IP as timing out from SF, Singapore, Bangalore, Sydney, Tokyo. It works from Dallas and parts East.

I used https://tools.keycdn.com/traceroute to get a bunch of traceroutes while chatting in -sre IRC:

I've dumped them in

Frankfurt
Start: 2021-01-12T22:48:55+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 10.0.10.1 0.0% 4 0.0 0.1 0.0 0.1 0.0
2.|-- 62.115.47.24 0.0% 4 0.3 0.6 0.3 1.4 0.5
3.|-- 62.115.116.16 25.0% 4 0.4 0.5 0.4 0.6 0.1
4.|-- 62.115.114.88 0.0% 4 7.9 7.8 7.7 7.9 0.1
5.|-- 62.115.120.240 0.0% 4 8.6 8.6 8.5 8.7 0.1
6.|-- 62.115.122.179 0.0% 4 8.0 8.0 8.0 8.1 0.0
7.|-- 62.115.145.25 0.0% 4 48.6 18.4 8.1 48.6 20.2
8.|-- 91.198.174.252 0.0% 4 8.1 8.1 8.1 8.2 0.0
9.|-- 91.198.174.248 0.0% 4 90.0 90.2 89.9 90.8 0.4
10.|-- 208.80.154.213 0.0% 4 107.7 109.5 107.6 114.5 3.4
11.|-- 185.15.56.245 0.0% 4 94.1 94.1 94.1 94.2 0.0

Amsterdam
Start: 2021-01-12T22:48:56+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
2.|-- 10.82.4.40 0.0% 4 0.5 1.3 0.5 3.6 1.6
3.|-- 138.197.250.120 0.0% 4 1.0 6.9 0.9 24.7 11.9
4.|-- 138.197.250.96 0.0% 4 0.5 0.7 0.5 0.9 0.2
5.|-- 80.249.209.176 0.0% 4 1.2 4.6 1.0 14.4 6.5
6.|-- 91.198.174.248 0.0% 4 85.7 85.7 84.8 87.3 1.1
7.|-- 208.80.154.213 0.0% 4 104.5 113.7 104.5 137.5 15.9
8.|-- 185.15.56.245 0.0% 4 88.6 88.9 88.6 89.4 0.4

London
Start: 2021-01-12T22:48:56+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 46.101.0.254 0.0% 4 0.5 0.6 0.4 1.0 0.3
2.|-- 138.197.249.98 0.0% 4 0.9 1.0 0.9 1.2 0.1
3.|-- 138.197.251.128 0.0% 4 26.6 7.1 0.5 26.6 13.0
4.|-- 138.197.244.65 0.0% 4 9.5 18.2 9.5 43.8 17.0
5.|-- 80.249.209.176 0.0% 4 10.1 10.2 10.1 10.4 0.1
6.|-- 91.198.174.248 0.0% 4 87.0 87.0 86.9 87.0 0.1
7.|-- 208.80.154.213 0.0% 4 102.1 101.8 98.5 104.5 2.5
8.|-- 185.15.56.245 0.0% 4 86.7 86.8 86.7 87.1 0.2

New York
Start: 2021-01-12T22:48:57+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 134.209.160.253 0.0% 4 0.4 8.6 0.4 31.0 15.0
2.|-- 138.197.251.12 0.0% 4 1.0 1.0 0.8 1.1 0.1
3.|-- 138.197.248.18 0.0% 4 0.5 0.6 0.5 0.7 0.0
4.|-- 138.197.244.46 0.0% 4 1.1 1.1 1.0 1.1 0.1
5.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
6.|-- 77.109.128.139 0.0% 4 70.7 70.8 70.7 70.9 0.1
7.|-- 82.197.168.41 0.0% 4 70.7 70.7 70.7 70.8 0.0
8.|-- 77.109.129.13 0.0% 4 76.5 76.6 76.5 76.7 0.1
9.|-- 77.109.134.114 0.0% 4 76.8 76.8 76.8 76.9 0.0
10.|-- 91.198.174.254 0.0% 4 77.1 77.0 77.0 77.1 0.1
11.|-- 91.198.174.248 0.0% 4 83.2 83.2 83.2 83.3 0.0
12.|-- 208.80.154.213 0.0% 4 104.7 181.8 96.4 425.0 162.2
13.|-- 185.15.56.245 0.0% 4 83.3 83.3 83.2 83.6 0.2

Dallas
Start: 2021-01-12T22:48:58+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 45.79.12.201 0.0% 4 0.9 0.9 0.7 1.1 0.2
2.|-- 45.79.12.0 0.0% 4 0.5 0.5 0.4 0.7 0.1
3.|-- 45.79.12.9 0.0% 4 0.6 5.2 0.5 16.6 7.7
4.|-- 206.53.202.101 0.0% 4 20.2 6.4 1.8 20.2 9.2
5.|-- 208.80.153.205 0.0% 4 35.1 35.0 34.9 35.1 0.1
6.|-- 208.80.154.213 0.0% 4 56.2 54.2 51.1 56.5 2.6
7.|-- 185.15.56.245 0.0% 4 35.1 34.9 34.9 35.1 0.1

San Francisco
Start: 2021-01-12T22:48:59+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 104.236.128.254 0.0% 4 1.7 1.8 0.3 4.5 1.8
2.|-- 138.197.248.208 0.0% 4 17.4 4.8 0.5 17.4 8.4
3.|-- 138.197.248.199 0.0% 4 0.4 0.4 0.3 0.5 0.1
4.|-- 206.197.187.82 0.0% 4 0.7 0.6 0.5 0.8 0.1
5.|-- 198.35.26.197 0.0% 4 0.6 0.7 0.6 0.7 0.0
6.|-- 198.35.26.211 0.0% 4 42.7 42.7 42.6 42.8 0.1
7.|-- 208.80.153.220 0.0% 4 69.2 69.3 69.1 69.7 0.2
8.|-- 208.80.154.211 0.0% 4 83.9 139.3 82.3 306.7 111.6
9.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0

Singapore
Start: 2021-01-12T22:48:59+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 128.199.191.253 0.0% 4 0.4 1.7 0.4 5.0 2.2
2.|-- 138.197.251.188 0.0% 4 1.2 1.1 1.0 1.2 0.1
3.|-- 138.197.251.187 0.0% 4 0.4 0.6 0.4 1.1 0.3
4.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
5.|-- 103.102.166.139 0.0% 4 205.2 206.3 205.2 209.2 1.9
6.|-- 208.80.153.220 0.0% 4 236.5 240.8 236.5 252.3 7.7
7.|-- 208.80.154.211 0.0% 4 247.2 318.9 247.2 531.9 142.0
8.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0

Sydney
Start: 2021-01-12T22:49:01+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 45.79.118.1 0.0% 4 0.2 0.2 0.1 0.2 0.0
2.|-- 10.216.64.3 0.0% 4 0.6 0.4 0.3 0.6 0.2
3.|-- 10.216.32.10 0.0% 4 2.0 1.6 0.9 2.6 0.8
4.|-- 10.216.32.7 0.0% 4 0.6 0.5 0.4 0.6 0.1
5.|-- 172.105.160.8 0.0% 4 0.6 0.8 0.5 1.6 0.5
6.|-- 192.80.16.70 0.0% 4 0.5 0.6 0.5 0.7 0.1
7.|-- 129.250.5.44 0.0% 4 11.2 9.2 1.0 23.4 10.5
8.|-- 129.250.6.144 50.0% 4 93.4 96.2 93.4 99.0 3.9
9.|-- 129.250.2.74 0.0% 4 95.2 94.3 93.0 95.8 1.4
10.|-- 116.51.26.210 0.0% 4 95.6 94.7 94.3 95.6 0.6
11.|-- 103.102.166.140 0.0% 4 94.8 94.6 94.4 94.8 0.2
12.|-- 103.102.166.139 0.0% 4 243.7 243.1 242.5 243.7 0.6
13.|-- 208.80.153.220 0.0% 4 262.6 262.1 261.5 262.6 0.6
14.|-- 208.80.154.211 0.0% 4 311.2 315.3 274.2 400.0 59.0
15.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0

Tokyo
Start: 2021-01-12T22:49:01+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- 139.162.65.3 0.0% 4 1.0 1.2 0.7 2.1 0.6
2.|-- 139.162.64.16 0.0% 4 0.7 0.7 0.6 0.8 0.1
3.|-- 63.218.251.145 0.0% 4 0.7 0.8 0.7 0.9 0.0
4.|-- 63.223.34.138 0.0% 4 84.6 84.8 84.6 85.0 0.1
5.|-- 103.102.166.134 0.0% 4 84.5 84.6 84.5 84.8 0.1
6.|-- 103.102.166.140 0.0% 4 103.5 103.5 103.4 103.5 0.1
7.|-- 103.102.166.139 0.0% 4 223.1 223.9 222.7 227.1 2.2
8.|-- 208.80.153.220 0.0% 4 257.0 257.1 257.0 257.2 0.1
9.|-- 208.80.154.211 0.0% 4 259.9 328.1 259.9 526.7 132.4
10.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0

Bangalore
Start: 2021-01-12T22:49:00+0000
Loss Snt Last Avg Best Wrst StDev
1.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
2.|-- 10.66.4.48 0.0% 4 8.8 2.7 0.5 8.8 4.1
3.|-- 138.197.249.22 0.0% 4 0.4 0.5 0.4 0.8 0.2
4.|-- 219.65.110.185 0.0% 4 6.3 2.9 1.5 6.3 2.3
5.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
6.|-- 180.87.36.9 0.0% 4 10.0 10.5 9.9 12.2 1.1
7.|-- 180.87.36.41 0.0% 4 41.9 41.9 41.8 42.1 0.1
8.|-- 120.29.215.34 0.0% 4 42.5 42.6 42.5 42.9 0.2
9.|-- 180.87.164.62 0.0% 4 43.0 42.8 42.7 43.0 0.1
10.|-- 103.102.166.140 0.0% 4 40.9 41.1 40.8 41.5 0.3
11.|-- 103.102.166.139 0.0% 4 246.4 246.5 246.4 246.5 0.0
12.|-- 208.80.153.220 0.0% 4 246.6 246.6 246.6 246.6 0.0
13.|-- 208.80.154.211 0.0% 4 260.9 305.0 260.9 427.5 81.8
14.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0

On the working traces, the second-to-last hop is always 208.80.154.213 (irb-1103.cloudsw1-d5-eqiad.wikimedia.org). The failing traces all go through 208.80.154.211 (irb-1102.cloudsw1-c8-eqiad.wikimedia.org) instead, and nothing responds after that hop.
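For anyone who wants to repeat this comparison without eyeballing the paste, here is a minimal Python sketch. It assumes the hop rows look like the mtr-style rows above; the helper names (`parse_hops`, `last_responding_hop`, `reached`) are made up for illustration, and the sample data is just the last few hops of the Dallas and San Francisco traces.

```python
import re

# Matches an mtr-style hop row such as "7.|-- 185.15.56.245 0.0% 4 35.1 ...".
# search() is used so a leading paste line number does not break the match.
HOP_RE = re.compile(r"(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+\d+")


def parse_hops(report):
    """Return a list of (hop_number, host, loss_percent) tuples."""
    hops = []
    for line in report.splitlines():
        m = HOP_RE.search(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), float(m.group(3))))
    return hops


def last_responding_hop(hops):
    """Return the host of the last hop that answered, or None."""
    for _, host, _ in reversed(hops):
        if host != "???":
            return host
    return None


def reached(hops, target):
    """True if the final hop of the trace is the target itself."""
    return bool(hops) and hops[-1][1] == target


# Tail of the Dallas (working) and San Francisco (failing) traces above.
DALLAS = """
5.|-- 208.80.153.205 0.0% 4 35.1 35.0 34.9 35.1 0.1
6.|-- 208.80.154.213 0.0% 4 56.2 54.2 51.1 56.5 2.6
7.|-- 185.15.56.245 0.0% 4 35.1 34.9 34.9 35.1 0.1
"""
SAN_FRANCISCO = """
7.|-- 208.80.153.220 0.0% 4 69.2 69.3 69.1 69.7 0.2
8.|-- 208.80.154.211 0.0% 4 83.9 139.3 82.3 306.7 111.6
9.|-- ??? 100.0 4 0.0 0.0 0.0 0.0 0.0
"""

TARGET = "185.15.56.245"
for name, report in (("Dallas", DALLAS), ("San Francisco", SAN_FRANCISCO)):
    hops = parse_hops(report)
    print(name, "reached" if reached(hops, TARGET) else "failed",
          "last responding hop:", last_responding_hop(hops))
```

Run over all ten traces, this should make the split visible: traces that reach the target pass through 208.80.154.213, while the ones that time out stop answering after 208.80.154.211.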

Unfortunately, my networking skills end there.

Floating IPs in eqiad1 are from network 5c9ee953-3a19-4e84-be0f-069b5da75123 which is associated with two subnets:

efbb8c8a-1397-4faf-a07f-e9bcc33899b5: 185.15.56.0/25 aka 185.15.56.2-185.15.56.126
7c6bcc12-212f-44c2-9954-5c55002ee371: 185.15.56.240/29 aka 185.15.56.242-185.15.56.246
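To make the difference between the two addresses concrete, here is a small sketch using Python's standard `ipaddress` module. The subnet IDs and CIDRs are the ones listed above; the `subnet_for` helper is a made-up name for illustration.

```python
import ipaddress

# Subnet IDs and CIDRs as listed above for the eqiad1 floating IP network.
SUBNETS = {
    "efbb8c8a-1397-4faf-a07f-e9bcc33899b5": ipaddress.ip_network("185.15.56.0/25"),
    "7c6bcc12-212f-44c2-9954-5c55002ee371": ipaddress.ip_network("185.15.56.240/29"),
}


def subnet_for(address):
    """Return the ID of the subnet containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for subnet_id, net in SUBNETS.items():
        if ip in net:
            return subnet_id
    return None


# The unreachable IP and the working IP from the task description.
for address in ("185.15.56.245", "185.15.56.60"):
    print(address, "->", subnet_for(address))

# First and last usable host address of each subnet.
for subnet_id, net in SUBNETS.items():
    hosts = list(net.hosts())
    print(subnet_id, net, hosts[0], "-", hosts[-1])
```

The usable host ranges come out as .1 to .126 and .241 to .246. The ranges quoted above start one address later (.2 and .242), presumably because the first address is held back from the allocation pool, but that is an assumption on my part.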

I'm going to venture a guess that the second subnet is largely unknown and gets overlooked when filtering rules are written, so policy for that last /29 is inconsistent.

I'm not sure if the right solution is to fix the /29 or just stop using it.

I've changed the TTLs for that domain to 60 and will switch to a lower IP once the TTLs have refreshed. Then we can figure out what to do about .245.

I've moved toolserver.org to an IP on the lower subnet: 185.15.56.62.

Nevertheless, there remains the question of what to do about 185.15.56.240/29. I see a few IPs allocated from that range, so if anyone is using them they're probably running into trouble.

Andrew renamed this task from Broken transit for toolserver.org/185.15.56.245 to Broken transit/filtering for wmcs floating IP range 185.15.56.240/29. Jan 13 2021, 12:20 AM

Arturo, I'm handing this off to you -- we need to either document this better or stop using the upper IP range.

An example of 'documentation' is in modules/network/data/data.yaml:

eqiad:
  private:
    cloud-instances2-b-eqiad:
      ipv4: 172.16.0.0/21
  public:
    cloud-eqiad1-floating:
      ipv4: 185.15.56.0/25

So 185.15.56.240/29 is not a CIDR that floating IPs are meant to be allocated from. It is just an interlink subnet. Perhaps the problem here is neutron allowing floating IP allocation from the wrong subnet, something I've seen before...:

Mentioned in SAL (#wikimedia-cloud) [2021-01-13T10:02:03Z] <arturo> prevent floating IP allocation from neutron transport subnet: root@cloudcontrol1005:~# neutron subnet-update --allocation-pool start=185.15.56.244,end=185.15.56.244 cloud-instances-transport1-b-eqiad1 (T271867)

Mentioned in SAL (#wikimedia-cloud) [2021-01-13T10:02:40Z] <arturo> delete floating IP allocation 185.15.56.245 (T271867)

Mentioned in SAL (#wikimedia-cloud) [2021-01-13T10:05:24Z] <arturo> release and delete floating IP 185.15.56.242 (docker-registry.toolsbeta.wmflabs.org) (T271867)

Mentioned in SAL (#wikimedia-cloud) [2021-01-13T10:07:13Z] <arturo> allocate floating IP 185.15.56.84, and use it for docker-registry.toolsbeta.wmflabs.org (instance toolsbeta-docker-registry-01) (T271867)
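A check along the following lines could catch this kind of drift earlier. It is only a sketch: the documented range is the `cloud-eqiad1-floating` entry from the data.yaml excerpt above, the address list is just the floating IPs mentioned in this task, and the function name is made up. A real audit would pull the current allocations from neutron instead of a hard-coded list.

```python
import ipaddress

# Documented floating IP range (cloud-eqiad1-floating in the data.yaml excerpt).
DOCUMENTED_FLOATING = ipaddress.ip_network("185.15.56.0/25")


def outside_documented_range(addresses, allowed=DOCUMENTED_FLOATING):
    """Return the addresses that do not fall inside the documented range."""
    return [a for a in addresses if ipaddress.ip_address(a) not in allowed]


# Floating IPs mentioned in this task (hard-coded for illustration only).
allocated = [
    "185.15.56.60",
    "185.15.56.62",
    "185.15.56.84",
    "185.15.56.242",
    "185.15.56.245",
]

for address in outside_documented_range(allocated):
    print("allocated outside documented floating range:", address)
```

With the addresses from this task, only 185.15.56.242 and 185.15.56.245 are flagged, which are the two allocations that were released.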

aborrero triaged this task as Medium priority. Jan 13 2021, 10:14 AM
aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Thanks everyone for looking into this