possible routing issue between eqiad and Maxmind network
Closed, Resolved · Public


We're seeing severe packet loss between the frack payments servers and the API servers on Maxmind's network. I also tested from bast1002, since that's a simpler route.

bast1002 (                                                                                Tue Sep 24 01:04:55 2019
Resolver: Received error response 2. (server failure)
                                                                                  Packets               Pings
 Host                                                                           Loss%   Snt   Last   Avg  Best  Wrst StDev
 1.               0.0%   196    0.2   0.6   0.2  14.0   1.6
 2.                           0.0%   196    0.2   0.4   0.2   7.1   0.6
 3. ???
 4.                                                      96.9%   195    0.5   0.5   0.5   0.7   0.0

Note that routing is clean between codfw and Maxmind, and also from my home network to Maxmind. I checked, and they report all systems operational. I also checked other IPs on their network, with the same results. Is this a peering issue, or something else at our end?
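
For anyone repeating this check from other hosts, here is a minimal sketch of pulling the per-hop loss out of mtr rows like the ones pasted above and flagging hops over a threshold. The names `Hop`, `parse_mtr_row`, and `lossy_hops`, and the 5% default threshold, are illustrative assumptions and not part of any existing tooling. It only looks at the `Loss%` and `Snt` columns. Keep in mind that loss reported at an intermediate hop can simply be ICMP rate limiting; loss at the final hop, as seen here, is the more meaningful signal.

```python
from typing import Iterable, List, NamedTuple, Optional


class Hop(NamedTuple):
    number: int      # hop index as printed by mtr ("4." -> 4)
    loss_pct: float  # value of the Loss% column
    sent: int        # value of the Snt column


def parse_mtr_row(line: str) -> Optional[Hop]:
    """Parse one mtr row; return None for headers, '???' hops, or other text."""
    tokens = line.split()
    if len(tokens) < 3 or not tokens[0].endswith("."):
        return None
    try:
        number = int(tokens[0][:-1])
    except ValueError:
        return None
    # The host column may be missing, so locate Loss% by its trailing '%'.
    for i, tok in enumerate(tokens[1:-1], start=1):
        if tok.endswith("%"):
            try:
                return Hop(number, float(tok[:-1]), int(tokens[i + 1]))
            except ValueError:
                return None
    return None


def lossy_hops(lines: Iterable[str], threshold: float = 5.0) -> List[Hop]:
    """Return hops whose loss exceeds threshold (an arbitrary default)."""
    hops = (parse_mtr_row(line) for line in lines)
    return [h for h in hops if h is not None and h.loss_pct > threshold]


if __name__ == "__main__":
    sample = [
        " 1.   0.0%   196    0.2   0.6   0.2  14.0   1.6",
        " 2.   0.0%   196    0.2   0.4   0.2   7.1   0.6",
        " 3. ???",
        " 4.  96.9%   195    0.5   0.5   0.5   0.7   0.0",
    ]
    for hop in lossy_hops(sample):
        print(f"hop {hop.number}: {hop.loss_pct}% loss over {hop.sent} packets")
```

Rows without statistics (such as the `???` hop) are skipped rather than treated as loss, since mtr prints them for hops that never answer.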

Event Timeline

Jgreen created this task. Sep 24 2019, 1:08 AM
Restricted Application added a project: Operations. Sep 24 2019, 1:08 AM
Restricted Application added a subscriber: Aklapper.
Jgreen triaged this task as Unbreak Now! priority. Sep 24 2019, 1:12 AM

Flipping this to "Unbreak Now!" since it's a time-sensitive issue and a service outage that is interfering with the donation pipeline. We do have some donation activity at the moment.

Restricted Application added a subscriber: Liuxinyu970226. Sep 24 2019, 1:12 AM
ayounsi claimed this task. Sep 24 2019, 2:03 AM
ayounsi updated the task description.
ayounsi added a subscriber: ayounsi.

All those IPs are behind Cloudflare. Opened a ticket with them.

ayounsi closed this task as Resolved. Sep 24 2019, 2:47 AM

Resolved by Cloudflare.

Dwisehaupt moved this task from Triage to Done on the fundraising-tech-ops board. Feb 13 2020, 9:30 PM