
Configure ingress internal DNS records
Closed, Resolved, Public

Description

We need to create internal DNS records (probably superset.eqiad.wmnet and superset-next.eqiad.wmnet) as CNAME records pointing to k8s-ingress-dse.svc.eqiad.wmnet.

Event Timeline

brouberol renamed this task from Configure ingress DNS records to Configure ingress internal DNS records. Feb 2 2024, 8:04 AM
brouberol created this task.

Change 995174 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/dns@master] Add superset/superset-next.svc.eqiad.wmnet records

https://gerrit.wikimedia.org/r/995174

Mentioned in SAL (#wikimedia-analytics) [2024-02-06T13:39:10Z] <brouberol> add new TLS SANs to the superset/superset-next certificates in dse-k8s-eqiad - T356481

Change 997858 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] superset: setup dyna mapping rules

https://gerrit.wikimedia.org/r/997858

Gehel triaged this task as High priority. Feb 9 2024, 1:36 PM

Change 995174 merged by Brouberol:

[operations/dns@master] Add superset/superset-next.svc.eqiad.wmnet records

https://gerrit.wikimedia.org/r/995174

brouberol@dns1004:~$ dig +short superset-next.svc.eqiad.wmnet superset.svc.eqiad.wmnet
k8s-ingress-dse.svc.eqiad.wmnet.
10.2.2.91
k8s-ingress-dse.svc.eqiad.wmnet.
10.2.2.91
brouberol@cumin1002:~$ curl -v https://superset-next.svc.eqiad.wmnet:30443/health
* Uses proxy env variable no_proxy == 'wikipedia.org,wikimedia.org,wikibooks.org,wikinews.org,wikiquote.org,wikisource.org,wikiversity.org,wikivoyage.org,wikidata.org,wikiworkshop.org,wikifunctions.org,wiktionary.org,mediawiki.org,wmfusercontent.org,w.wiki,wmnet,127.0.0.1,::1'
*   Trying 10.2.2.91:30443...
* Connected to superset-next.svc.eqiad.wmnet (10.2.2.91) port 30443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=superset-next.discovery.wmnet
*  start date: Feb  6 13:35:00 2024 GMT
*  expire date: Mar  5 13:35:00 2024 GMT
*  subjectAltName: host "superset-next.svc.eqiad.wmnet" matched cert's "superset-next.svc.eqiad.wmnet"
*  issuer: C=US; L=San Francisco; O=Wikimedia Foundation, Inc; OU=SRE Foundations; CN=discovery
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55909bdb7620)
> GET /health HTTP/2
> Host: superset-next.svc.eqiad.wmnet:30443
> user-agent: curl/7.74.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200
< server: istio-envoy
< date: Fri, 09 Feb 2024 14:15:48 GMT
< content-type: text/html; charset=utf-8
< content-length: 2
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
< content-security-policy: base-uri 'self'; default-src 'self'; img-src 'self' blob: data:; worker-src 'self' blob:; connect-src 'self' https://api.mapbox.com https://events.mapbox.com; object-src 'none'; style-src 'self' 'unsafe-inline'; script-src 'self' 'strict-dynamic' 'nonce-xwTUQ6z_gt5NndSTWFFBGjefB22LeVid'
< strict-transport-security: max-age=31556926; includeSubDomains
< referrer-policy: strict-origin-when-cross-origin
< vary: Accept-Encoding
< x-envoy-upstream-service-time: 1026
<
* Connection #0 to host superset-next.svc.eqiad.wmnet left intact
OK
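
The `dig +short` check above could be scripted so it can be rerun after later DNS changes. This is only a rough sketch: the function names `classify_dig_short` and `points_at_ingress` are made up for illustration, and it assumes the raw `dig +short` output has already been captured as a string.

```python
import ipaddress

# CNAME target taken from the task description (with the trailing dot dig prints).
EXPECTED_TARGET = "k8s-ingress-dse.svc.eqiad.wmnet."


def classify_dig_short(output: str) -> tuple[list[str], list[str]]:
    """Split `dig +short` output lines into (cname_targets, addresses)."""
    cnames: list[str] = []
    addresses: list[str] = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            # Lines that parse as an IP address are A/AAAA answers.
            ipaddress.ip_address(line)
        except ValueError:
            # Anything else is treated as a CNAME target.
            cnames.append(line)
        else:
            addresses.append(line)
    return cnames, addresses


def points_at_ingress(output: str, expected_target: str = EXPECTED_TARGET) -> bool:
    """True if every CNAME is the expected target and all answers share one address."""
    cnames, addresses = classify_dig_short(output)
    return (
        bool(cnames)
        and all(c == expected_target for c in cnames)
        and len(set(addresses)) == 1
    )
```

Fed the four answer lines shown in the `dig` output above, `points_at_ingress` returns `True`; an answer containing only an address (no CNAME) returns `False`.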

Change 997858 merged by Brouberol:

[operations/puppet@production] superset: setup dyna mapping rules

https://gerrit.wikimedia.org/r/997858

It seems that https://superset-k8s.wikimedia.org and https://superset-next-k8s.wikimedia.org return 502 errors.

It might be expected for https://superset-k8s.wikimedia.org, as the current login layer, mirrored from our bare metal deployment, is causing loops between / --> /superset/welcome/ --> /login/ --> /...

However, I have deployed a different auth strategy to superset-next, causing / to redirect to /superset/welcome, which redirects to /login, which returns a 200.

I've reached out to @Fabfur, and at the moment we're not sure exactly what's happening.

We figured it out. ATS was sending requests with the header Host: superset-next-k8s.wikimedia.org, which was not among the Ingress alternative FQDNs. Once that was fixed, we started seeing requests come through and get answered with 200s.
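
To illustrate the failure mode, here is a simplified model of host-based routing. It is not how Istio actually implements matching: `normalize_host` and `ingress_accepts` are hypothetical helpers, and the host sets in the usage note are illustrative rather than the real Ingress configuration.

```python
def normalize_host(host_header: str) -> str:
    """Lowercase a Host header value and drop any :port suffix and trailing dot."""
    host = host_header.strip().lower()
    # Strip a numeric port if present (IPv6 literals are out of scope for this sketch).
    name, _, port = host.rpartition(":")
    if name and port.isdigit():
        host = name
    return host.rstrip(".")


def ingress_accepts(host_header: str, ingress_hosts: set[str]) -> bool:
    """True if the request's Host matches one of the FQDNs the ingress is configured for."""
    known = {h.lower().rstrip(".") for h in ingress_hosts}
    return normalize_host(host_header) in known
```

With an assumed host set of `{"superset-next.svc.eqiad.wmnet", "superset-next.discovery.wmnet"}`, `ingress_accepts("superset-next-k8s.wikimedia.org", hosts)` is `False`, which mirrors the state before the fix. Adding `"superset-next-k8s.wikimedia.org"` to the set makes it `True`.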