Report of esams unreachable from Fastweb/Init7
Closed, Resolved, Public

Description

I've received a report that esams is unreachable from (some parts of?) Fastweb at the moment.


#11023 (15.05.17 02:00): Outage: Core router in Amsterdam, Vancis connectivity issues
Duration: 15.05.17 02:00 -
Severity (1-5): 1
Status: In Progress
Description: We are experiencing connection problems to our core router r1ams1.core in Amsterdam, Vancis. All customer traffic as well as BGP sessions at AMS-IX are affected.

Update: 13:53 (CEST)
Link to AMS-IX has been disabled. Our network connections are stable again.

https://www.as13030.net/status.php#collapse_11023


traceroute as seen from 2.235.74.121

traceroute to www.wikipedia.org (91.198.174.192), 64 hops max, 52 byte packets
 1  192.168.1.254 (192.168.1.254)  3145.130 ms  2.139 ms  2.169 ms
 2  10.56.187.3 (10.56.187.3)  4.892 ms  3.542 ms  3.242 ms
 3  10.251.130.25 (10.251.130.25)  3.720 ms  3.535 ms  3.481 ms
 4  10.251.126.32 (10.251.126.32)  4.881 ms  3.839 ms  4.020 ms
 5  10.251.127.1 (10.251.127.1)  4.639 ms  3.930 ms  5.782 ms
 6  10.251.131.194 (10.251.131.194)  4.509 ms  9.883 ms  23.493 ms 
 7  10.0.1.61 (10.0.1.61)  5.355 ms  4.488 ms  4.684 ms
 8  10.254.12.29 (10.254.12.29)  5.583 ms  5.564 ms
    10.254.9.185 (10.254.9.185)  4.824 ms
 9  10.254.20.110 (10.254.20.110)  8.125 ms
    10.254.12.141 (10.254.12.141)  12.098 ms
    10.254.12.197 (10.254.12.197)  4.866 ms
10  62-101-124-125.fastres.net (62.101.124.125)  5.128 ms
    62-101-124-129.fastres.net (62.101.124.129)  7.290 ms  6.278 ms
11  89.96.200.46 (89.96.200.46)  4.382 ms
    89.96.200.122 (89.96.200.122)  5.413 ms
    93-63-100-170.ip27.fastwebnet.it (93.63.100.170)  5.624 ms
12  r1fra3.core.init7.net (80.81.192.67)  14.067 ms
    89.96.200.46 (89.96.200.46)  6.600 ms
    89.96.200.122 (89.96.200.122)  4.346 ms
13  r1fra3.core.init7.net (80.81.192.67)  13.747 ms  13.878 ms *
14  * * *
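
The point where the trace above goes silent can be picked out mechanically. The following is a minimal sketch, assuming classic `traceroute` text output in which each hop starts with its number, continuation lines list further responders for the same hop, and responder addresses appear in parentheses. The function name `last_responding_hop` and the shortened `sample` are illustrative only and are not part of the original report.

```python
import re

# A hop line starts with the hop number followed by whitespace.
# Continuation lines such as "    10.254.9.185 (...)" do not match,
# because the leading digits are followed by "." or "-" rather than a space.
HOP_START = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4_IN_PARENS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")


def last_responding_hop(traceroute_text):
    """Return (hop_number, [addresses]) for the last hop that answered, or None."""
    hops = {}  # hop number -> responder addresses, in the order first seen
    current = None
    for line in traceroute_text.splitlines():
        if line.startswith("traceroute to"):
            continue  # header line, not a hop
        match = HOP_START.match(line)
        if match:
            current = int(match.group(1))
            hops.setdefault(current, [])
            rest = match.group(2)
        else:
            rest = line  # continuation of the current hop
        if current is None:
            continue
        for addr in IPV4_IN_PARENS.findall(rest):
            if addr not in hops[current]:
                hops[current].append(addr)
    answered = [number for number, addrs in hops.items() if addrs]
    if not answered:
        return None
    last = max(answered)
    return last, hops[last]


# Tail of the trace from the report (hops 11 to 14).
sample = """\
traceroute to www.wikipedia.org (91.198.174.192), 64 hops max, 52 byte packets
11  89.96.200.46 (89.96.200.46)  4.382 ms
    89.96.200.122 (89.96.200.122)  5.413 ms
    93-63-100-170.ip27.fastwebnet.it (93.63.100.170)  5.624 ms
12  r1fra3.core.init7.net (80.81.192.67)  14.067 ms
    89.96.200.46 (89.96.200.46)  6.600 ms
    89.96.200.122 (89.96.200.122)  4.346 ms
13  r1fra3.core.init7.net (80.81.192.67)  13.747 ms  13.878 ms *
14  * * *
"""

print(last_responding_hop(sample))  # (13, ['80.81.192.67'])
```

On this trace the last responder is r1fra3.core.init7.net at hop 13. This only shows where replies stop on the forward path; it does not by itself prove which network is dropping the traffic.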

Event Timeline

I tried a traceroute from our side, and it takes a different path:

filippo@cr1-esams> traceroute 2.235.74.121 
traceroute to 2.235.74.121 (2.235.74.121), 30 hops max, 40 byte packets
 1  ae1-402.cr2-esams.wikimedia.org (91.198.174.252)  0.702 ms  12.652 ms  15.271 ms
 2  i00ams-015.ip-plus.net (80.249.208.100)  1.402 ms  1.516 ms  1.522 ms
 3  i62bsw-025-hun0-15-0-0.bb.ip-plus.net (138.187.129.198)  11.926 ms  13.115 ms  12.880 ms
 4  i79zhh-015-hun13-0-0.bb.ip-plus.net (138.187.129.63)  12.285 ms  16.958 ms  12.202 ms
 5  fastweb-corelli-ten2-0.bb.ip-plus.net (193.5.122.95)  17.334 ms  18.155 ms  17.889 ms
 6  * * *

Traffic on the interface toward Init7 is indeed not as smooth as usual.
I called Init7 and disabled the v4 and v6 BGP sessions.
The person I had on the phone mentioned that their engineers were working on an issue but didn't know whether it was related. An engineer will call me back within 10 minutes.

Got confirmation on IRC that the issue can't be reproduced.

From Init7:

We are experiencing some BGP issues in our backbone. Troubleshooting is under way and I'll contact you once we have fixed the issue.

Nemo_bis renamed this task from Report of esams unreachable from fastweb to Report of esams unreachable from Fastweb/Init7. May 15 2017, 10:06 AM
Nemo_bis triaged this task as High priority.
Nemo_bis updated the task description.

My connection has been unstable since this morning. Other customers of the French ISP Bouygues report the same problem. Here are my traceroute results:

WinMTR statistics
Host                        |   % | Sent | Recv | Best | Avrg | Wrst | Last
bbox.lan                    |   0 |  339 |  339 |    0 |    1 |   29 |    1
No response from host       | 100 |   68 |    0 |    0 |    0 |    0 |    0
be32.cbr01-ntr.net.bbox.fr  |   0 |  339 |  339 |    7 |   12 |   42 |    8
No response from host       | 100 |   68 |    0 |    0 |    0 |    0 |    0
lag23.rpt02-civ.net.bbox.fr |  47 |  120 |   64 |    0 |   11 |   62 |   15
ae2.cr2-esams.wikimedia.org |   0 |  339 |  339 |   12 |   17 |   44 |   17
text-lb.esams.wikimedia.org |   0 |  339 |  339 |   12 |   16 |   26 |   20
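
The loss column in the WinMTR output above can be cross-checked against the sent and received counts. This is a small sketch, assuming the lag23.rpt02-civ.net.bbox.fr row reads as 47% loss with 120 packets sent and 64 received (an interpretation of the fused digits, not something the report states separately), and assuming WinMTR rounds to the nearest whole percent. The helper name `loss_percent` is hypothetical.

```python
def loss_percent(sent, received):
    """Packet loss as a whole-number percentage, rounded to the nearest integer."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return round(100 * (sent - received) / sent)


# Rows as read from the WinMTR paste (the column split is an assumption).
print(loss_percent(120, 64))   # lag23.rpt02-civ.net.bbox.fr
print(loss_percent(339, 339))  # ae2.cr2-esams.wikimedia.org
print(loss_percent(68, 0))     # "No response from host"
```

Loss reported on a single intermediate hop while the final hops show none often just reflects ICMP rate limiting on that router, so the end-to-end rows are the more meaningful ones.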

Mentioned in SAL (#wikimedia-operations) [2017-05-23T14:09:26Z] <XioNoX> re-enabling BGP session to Init7 - T165288

From Init7:

Update: 2017.05.17 09:00 (CEST)
First link to AMS-IX has been enabled.
Update: 2017.05.18 09:40 (CEST)
Second link to AMS-IX has been enabled. Everything looks stable. We are closing the ticket.

Traffic is flowing again. Please reopen if any issues remain.