Can't reach eqiad or esams from Comcast in Portland, Oregon
Closed, Resolved · Public

Description

I'm having trouble reaching services directly in the eqiad or esams data centers via my Comcast home internet in Portland, Oregon. There are no problems with codfw or ulsfo, so most wiki pages are fine via the ulsfo proxies, but services like *.wmflabs.org fail entirely.

Current public IP is 73.37.60.183.

This may be part of an outage or ongoing problem with Comcast.

$ traceroute media-streaming.wmflabs.org
traceroute to media-streaming.wmflabs.org (208.80.155.156), 30 hops max, 60 byte packets
 1  23.92.24.3 (23.92.24.3)  0.527 ms 23.92.24.2 (23.92.24.2)  1.084 ms 23.92.24.3 (23.92.24.3)  0.665 ms
 2  173.230.159.56 (173.230.159.56)  0.408 ms 173.230.159.60 (173.230.159.60)  0.447 ms 173.230.159.12 (173.230.159.12)  0.374 ms
 3  172.18.0.37 (172.18.0.37)  9.104 ms ce-0-18-0-1.r02.snjsca04.us.bb.gin.ntt.net (192.80.17.169)  1.688 ms  1.686 ms
 4  ae-11.r22.snjsca04.us.bb.gin.ntt.net (129.250.3.120)  3.166 ms  2.944 ms 100ge9-1.core4.fmt2.he.net (184.105.80.181)  18.618 ms
 5  100ge9-1.core1.pao1.he.net (184.105.222.90)  1.369 ms  1.462 ms  1.353 ms
 6  ae-2.r05.asbnva02.us.bb.gin.ntt.net (129.250.2.22)  67.232 ms 100ge6-2.core1.ash1.he.net (184.105.222.42)  63.005 ms ae-20.r06.asbnva02.us.bb.gin.ntt.net (129.250.2.133)  67.987 ms
 7  ae-0.a03.asbnva02.us.bb.gin.ntt.net (129.250.5.194)  67.948 ms wikimedia-as14907.10gigabitethernet5.switch4.ash1.he.net (216.66.30.90)  60.314 ms xe-3-3-3.cr2-eqiad.wikimedia.org (206.126.236.221)  60.594 ms
 8  xe-0-0-28-0.a03.asbnva02.us.ce.gin.ntt.net (129.250.204.190)  67.419 ms  66.279 ms instance-novaproxy-01.project-proxy.wmflabs.org (208.80.155.156)  60.557 ms
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
$ curl -v https://text-lb.eqiad.wikimedia.org --insecure
* Rebuilt URL to: https://text-lb.eqiad.wikimedia.org/
*   Trying 2620:0:861:ed1a::1...
* TCP_NODELAY set
*   Trying 208.80.154.224...
* TCP_NODELAY set
* Connection failed
* connect to 2620:0:861:ed1a::1 port 443 failed: Operation timed out
* Connection failed
* connect to 208.80.154.224 port 443 failed: Operation timed out
* Failed to connect to text-lb.eqiad.wikimedia.org port 443: Operation timed out
* Closing connection 0
curl: (7) Failed to connect to text-lb.eqiad.wikimedia.org port 443: Operation timed out
$ curl -v https://text-lb.esams.wikimedia.org --insecure
* Rebuilt URL to: https://text-lb.esams.wikimedia.org/
*   Trying 2620:0:862:ed1a::1...
* TCP_NODELAY set
*   Trying 91.198.174.192...
* TCP_NODELAY set
* Connection failed
* connect to 2620:0:862:ed1a::1 port 443 failed: Operation timed out
* Connection failed
* connect to 91.198.174.192 port 443 failed: Operation timed out
* Failed to connect to text-lb.esams.wikimedia.org port 443: Operation timed out
* Closing connection 0
curl: (7) Failed to connect to text-lb.esams.wikimedia.org port 443: Operation timed out
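
For anyone else wanting to confirm which data centers they can reach, the curl checks above can be repeated across all four sites with a short script. This is only a minimal sketch: the eqiad and esams hostnames come from the commands above, while the codfw and ulsfo hostnames are assumed to follow the same text-lb.<site>.wikimedia.org pattern, and the function names are made up for illustration. It only tests whether a TCP connection to port 443 opens, not TLS or HTTP.

```python
import socket

# eqiad and esams hostnames appear in the curl commands above; the codfw and
# ulsfo names are an assumption based on the same naming pattern.
DATACENTERS = ("eqiad", "esams", "codfw", "ulsfo")


def can_connect(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        sock = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
    sock.close()
    return True


def check_datacenters(datacenters=DATACENTERS, connect=socket.create_connection):
    """Map each data center name to whether its text-lb endpoint accepts TCP on 443."""
    return {
        dc: can_connect(f"text-lb.{dc}.wikimedia.org", connect=connect)
        for dc in datacenters
    }
```

Calling check_datacenters() returns a dict such as {"eqiad": False, ...}. The connect parameter exists only so the check can be exercised without touching the network.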

Event Timeline

Restricted Application added a subscriber: Aklapper.

A first look seems to indicate an issue between Telia and Comcast, or within Comcast.

ayounsi@bast1002:~$ mtr 73.37.60.183 -z  --report-wide
Start: Fri Jun 29 17:00:29 2018
HOST: bast1002                                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS14907  ae3-1003.cr2-eqiad.wikimedia.org       0.0%    10    1.2   0.6   0.2   3.0   0.7
  2. AS1299   ash-b1-link.telia.net                  0.0%    10    0.7   0.9   0.7   1.8   0.0
  3. AS1299   comcast-ic-318834-ash-b1.c.telia.net   0.0%    10    0.8   0.7   0.6   0.9   0.0
  4. AS???    ???                                   100.0    10    0.0   0.0   0.0   0.0   0.0
$ sudo mtr bast1002.wikimedia.org -z  --report-wide
Password:
Start: 2018-06-29T10:07:32-0700
HOST: Orac.local                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS7922   2601:1c0:5201:b82:2cab:a4ff:fe47:1245           0.0%    10    5.9   5.9   3.5  13.5   2.9
  2. AS7922   2001:558:4060:26::1                             0.0%    10   13.8  21.0  13.4  55.8  12.5
  3. AS7922   ae-225-rur02.beaverton.or.bverton.comcast.net   0.0%    10   17.0  16.1  12.9  27.2   4.2
  4. AS7922   ae-51-ar01.troutdale.or.bverton.comcast.net     0.0%    10   17.6  17.4  14.0  19.5   2.0
  5. AS7922   be-33490-cr01.seattle.wa.ibone.comcast.net     10.0%    10   21.4  23.9  20.1  29.1   3.4
  6. AS7922   be-10846-pe01.seattle.wa.ibone.comcast.net      0.0%    10   27.0  22.8  18.1  28.7   3.5
  7. AS7922   2001:559::536                                   0.0%    10   17.1  22.9  16.8  45.4   9.2
  8. AS???    ???                                            100.0    10    0.0   0.0   0.0   0.0   0.0

(Note: the traceroute in the original comment might have been from the wrong window; use the mtr data instead.)

And over IPv4:

$ sudo mtr bast1002.wikimedia.org -z  --report-wide -4
Start: 2018-06-29T10:10:31-0700
HOST: Orac.local                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS???    10.0.0.1                                        0.0%    10   13.1  15.5   5.3  38.6   8.7
     AS7922   96.120.60.13
  2. AS7922   96.120.60.13                                    0.0%    10   15.4  15.4  11.3  23.8   4.1
  3. AS7922   ae-225-rur01.beaverton.or.bverton.comcast.net   0.0%    10   17.5  15.1   9.8  19.5   2.9
  4. AS7922   ae-2-rur02.beaverton.or.bverton.comcast.net     0.0%    10   15.3  15.2  11.2  25.3   4.3
  5. AS7922   ae-51-ar01.troutdale.or.bverton.comcast.net     0.0%    10   13.1  15.3  10.7  21.8   3.3
  6. AS7922   be-33490-cr01.seattle.wa.ibone.comcast.net      0.0%    10   18.2  19.7  17.9  24.2   2.2
  7. AS7922   be-10846-pe01.seattle.wa.ibone.comcast.net      0.0%    10   15.7  18.8  15.7  22.1   2.0
  8. AS7922   66.208.232.186                                  0.0%    10   62.0  25.5  16.4  62.0  14.3
  9. AS1299   62.115.117.49                                   0.0%    10   68.2  67.8  65.1  71.6   2.2
 10. AS???    ???                                            100.0    10    0.0   0.0   0.0   0.0   0.0
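
When comparing several of these reports, it can help to pull out the last hop that still answers. The following is a rough sketch that assumes the `mtr -z --report-wide` layout seen in the reports above (hop number, AS, host, loss, sent count); the regular expression and function names are illustrative and have not been checked against other mtr versions or output modes.

```python
import re

# Matches hop lines such as:
#   "  3. AS1299   comcast-ic-318834-ash-b1.c.telia.net   0.0%    10    0.8 ..."
# The loss column may or may not carry a trailing "%" (see the 100.0 rows).
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(AS\S+)\s+(\S+)\s+([\d.]+)%?\s+(\d+)\s")


def parse_hops(report):
    """Return a list of (hop, asn, host, loss_percent) tuples from an mtr report."""
    hops = []
    for line in report.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), m.group(3), float(m.group(4))))
    return hops


def last_responding_hop(report):
    """Return the last hop with less than 100% loss, or None if no hop responds."""
    responding = [hop for hop in parse_hops(report) if hop[3] < 100.0]
    return responding[-1] if responding else None
```

Continuation lines without a hop number (like the extra AS7922 row under hop 1 in the IPv4 report) are skipped by the pattern. Keep in mind that a silent hop only shows where replies stop arriving, not necessarily where packets are dropped.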

Mentioned in SAL (#wikimedia-operations) [2018-06-29T17:21:24Z] <XioNoX> deactivating v6 BGP session to Telia on cr2-eqiad - T198502

Now getting:

$ sudo mtr bast1002.wikimedia.org -z  --report-wide
Password:
Start: 2018-06-29T10:22:50-0700
HOST: Orac.local                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS7922   2601:1c0:5201:b82:2cab:a4ff:fe47:1245           0.0%    10    2.2  61.3   2.2 200.3  76.4
  2. AS7922   2001:558:4060:26::1                             0.0%    10   20.7  68.5  12.4 219.0  73.6
  3. AS7922   ae-225-rur02.beaverton.or.bverton.comcast.net   0.0%    10   14.1  55.7  12.9 176.8  59.5
  4. AS7922   ae-51-ar01.troutdale.or.bverton.comcast.net     0.0%    10   13.4  39.9  11.9 133.9  42.5
  5. AS7922   be-33490-cr01.seattle.wa.ibone.comcast.net      0.0%    10   23.9  69.1  23.7 208.6  65.6
  6. AS7922   be-10847-pe02.seattle.wa.ibone.comcast.net      0.0%    10   19.7  66.9  19.0 195.0  67.3
  7. AS2914   ae-31.a00.sttlwa01.us.bb.gin.ntt.net            0.0%    10   19.5  68.3  16.9 179.8  65.9
  8. AS2914   ae-9.r04.sttlwa01.us.bb.gin.ntt.net             0.0%    10   19.0  53.5  18.2 138.0  46.5
  9. AS2914   ae-5.r23.sttlwa01.us.bb.gin.ntt.net             0.0%    10   18.9  38.2  16.2 108.8  32.0
 10. AS2914   ae-3.r23.snjsca04.us.bb.gin.ntt.net             0.0%    10   35.1  46.7  31.8  89.6  21.5
 11. AS2914   ae-0.r22.snjsca04.us.bb.gin.ntt.net             0.0%    10   33.7  58.7  33.4 160.1  44.7
 12. AS???    ???                                            100.0    10    0.0   0.0   0.0   0.0   0.0
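
The effect of deactivating the Telia v6 session is easier to see if the per-hop AS column of the two IPv6 reports from the Comcast side is collapsed into an AS-level path. A small sketch, with the AS lists copied by hand from those reports and a hypothetical helper name:

```python
from itertools import groupby


def collapse_as_path(asns):
    """Collapse consecutive duplicate AS numbers, dropping unresolved 'AS???' hops."""
    known = [asn for asn in asns if asn != "AS???"]
    return [asn for asn, _ in groupby(known)]


# AS column of the IPv6 report taken before the Telia session was deactivated:
# hops 1-7 in AS7922, hop 8 unresolved.
before = ["AS7922"] * 7 + ["AS???"]
# AS column of the IPv6 report taken afterwards:
# hops 1-6 in AS7922, hops 7-11 in AS2914, hop 12 unresolved.
after = ["AS7922"] * 6 + ["AS2914"] * 5 + ["AS???"]

print(" -> ".join(collapse_as_path(before)))  # AS7922
print(" -> ".join(collapse_as_path(after)))   # AS7922 -> AS2914
```

So after the change the trace leaves Comcast via NTT (AS2914) and gets as far as NTT's San Jose routers, but replies still stop before reaching eqiad.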

Mentioned in SAL (#wikimedia-operations) [2018-06-29T17:28:19Z] <XioNoX> Re-activating v6 BGP session to Telia on cr2-eqiad - T198502

I'm affected, too. Comcast in CO.

Traffic now takes another path (GTT inbound to us, HE on the way back), but still no luck, so the issue seems to be within Comcast.

Looking at some Netops IRC channels, there seems to be a large ongoing Comcast issue.

Edit:
https://outage.report/us/xfinity
Possible cause: "double fiber cut, JFK <-> ORD; ATL <-> IAD"

translatewiki.net was also affected for me. But both TWN and Gerrit are back for me now.

Seems to have cleared up for me too now. Marking resolved. \o/