
Make toolforge k8s service names a CNAME to .svc. to allow editing without cloudinfra access
Closed, ResolvedPublic

Description

11:41:16 <Majavah> where does the "k8s.toolsbeta.eqiad1.wikimedia.cloud." record live? it's not in .svc. so I don't see it on horizon
11:45:24 <Majavah> (same thing for tools.)
12:03:21 <arturo> it should be in the parent zone
12:03:47 <arturo> it was a mistake (my mistake), it should have been in the .svc subdomain since the beginning
12:04:06 <arturo> now changing it is complicated because it involves TLS certs
12:25:27 <Majavah> could it be changed to a cname to something on .svc so that it can be changed without cloudinfra access?
12:26:07 <Majavah> I'm working on adding keepalived ha for the kubernetes haproxies, alternatively I could just ask you to change it when it's ready
12:39:42 <arturo> Majavah: about the CNAME yeah I think we can do that. But I prefer not to do such thing on friday. Would you mind opening a phab task so we don't forget next week?
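For clarity, the change being proposed in the log above can be sketched as Designate CLI commands. This is a dry run (the commands are echoed, not executed): exact flag names may vary by client version, and the VIP shown is the one that appears later in this task.

```shell
# Sketch of the proposed layout, as openstack (Designate) CLI dry-run commands.
# Zone names come from this task; the VIP is the one seen later in the task.
zone="toolsbeta.eqiad1.wikimedia.cloud."
vip="172.16.2.161"

# 1. A record inside the project-editable .svc. subzone, pointing at the haproxy VIP:
echo openstack recordset create --type A --record "$vip" "svc.$zone" "k8s.svc.$zone"

# 2. The name in the parent zone becomes a CNAME to the .svc. name, so future
#    edits only touch the .svc. zone and need no cloudinfra access:
echo openstack recordset create --type CNAME --record "k8s.svc.$zone" "$zone" "k8s.$zone"
```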

Event Timeline

Majavah renamed this task from Make toolforge k8s service names a CNAME to .svc. to allow editing without toolsbeta access to Make toolforge k8s service names a CNAME to .svc. to allow editing without cloudinfra access.May 7 2021, 9:49 AM

Hey @Bstorm, is this OK from your point of view?

If we try it in toolsbeta and the certs all still validate, sure. My worry is that cert validation will collapse unless we make sure it's a valid altname for the k8s cluster as well.

We may discover that it doesn't matter, but we will be rebuilding the toolsbeta cluster if it doesn't work out. It's really all about how much we need to rebuild and change. If we are ok rebuilding the toolsbeta cluster a few times (it's about time, right?), we can experiment and get it right. I'll be surprised if we can just do what this task says and get away with it. I mean, we might. I don't remember what alt names are on the cert :)

I wonder if we can regenerate the certs via kubeadm with a different altname on the control plane or something. k8s clients validate the name, and so do the kubelet and the control plane. Inside the cluster it's using a service name, so it will probably be valid from the service cluster names anyway.
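One possible route, sketched here as a kubeadm config fragment with an extra SAN. This is an assumption about the approach, not the cluster's actual kubeadm configuration; the names are taken from this task.

```shell
# Hypothetical ClusterConfiguration fragment adding the .svc. name as an extra
# SAN on the apiserver certificate (values are illustrative).
cfg=$(cat <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  certSANs:
    - "k8s.toolsbeta.eqiad1.wikimedia.cloud"
    - "k8s.svc.toolsbeta.eqiad1.wikimedia.cloud"
EOF
)
printf '%s\n' "$cfg"   # would be saved as e.g. kubeadm-san.yaml on a control node

# On a control-plane node, regenerating the cert would then look something like
# (not executed here):
#   rm /etc/kubernetes/pki/apiserver.{crt,key}
#   kubeadm init phase certs apiserver --config kubeadm-san.yaml
#   # restart kube-apiserver, then verify the SANs:
#   openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep DNS
```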

The name target doesn't "change" unless all kubeconfigs do as well, which is important.

Wait... I have the sense of this reversed. The current name becomes a CNAME to the svc name... that would Just Work™.

I think we should try it.

Mentioned in SAL (#wikimedia-cloud) [2021-05-07T16:16:19Z] <Majavah> create record k8s.svc.toolsbeta.eqiad1.wikimedia.cloud. pointing to haproxy vip T282227

It looks like I have to delete it and recreate it as a CNAME. That means that it will briefly cause some chaos on Toolforge when we do it in tools. We might want to do it really fast via CLI there. In toolsbeta, I can do it now.
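A quick-CLI version of that swap might look like this (echoed as a dry run; flag names should be double-checked against the installed Designate client):

```shell
# Dry-run sketch of the fast delete-and-recreate in the tools zone, to keep
# the window where k8s.tools. doesn't resolve as short as possible.
zone="tools.eqiad1.wikimedia.cloud."
echo openstack recordset delete "$zone" "k8s.$zone"
echo openstack recordset create --type CNAME --record "k8s.svc.$zone" "$zone" "k8s.$zone"
```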

Mentioned in SAL (#wikimedia-cloud) [2021-05-07T16:30:29Z] <bstorm> recreated k8s.toolsbeta.eqiad1.wikimedia.cloud. as a CNAME to k8s.svc.toolsbeta.eqiad1.wikimedia.cloud. T282227

The caching is frustratingly strong here. The old A record is still seen by the host somehow (if not by dig).

bstorm@toolsbeta-sgebastion-04:~$ kubectl get pods --all-namespaces
Unable to connect to the server: dial tcp 172.16.0.146:6443: connect: no route to host

For now, I can't get the host to forget the cached A record. Restarting nscd did not help. Turning the old proxies off just brought everything down, so my plan is to leave the old proxies up until the cache drops off. That would also be the best process for tools anyway, and potentially a zero-downtime method.
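For the record, the usual nscd pokes, echoed here as a dry run; `nscd -i` invalidates a single cache table per nscd(8), and `getent` shows the NSS view, which is the layer dig bypasses:

```shell
# Dry run of the cache-busting attempts; none of these helped in this case.
name="k8s.toolsbeta.eqiad1.wikimedia.cloud"
echo sudo nscd -i hosts            # invalidate only the hosts table
echo sudo systemctl restart nscd   # restart the daemon entirely
echo getent hosts "$name"          # NSS view (may disagree with dig, which queries DNS directly)
```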

That is, if this works once the cache drops off. I think it will, but we should let it prove itself before trying it in tools.

Oh! There's no stale cache on the control nodes.

bstorm@toolsbeta-test-k8s-control-4:~$ host k8s.toolsbeta.eqiad1.wikimedia.cloud
k8s.toolsbeta.eqiad1.wikimedia.cloud is an alias for k8s.svc.toolsbeta.eqiad1.wikimedia.cloud.
k8s.svc.toolsbeta.eqiad1.wikimedia.cloud has address 172.16.2.161

kubectl works there, and everything. I think that means this is a go. When the cache on the bastion drops off, I can clean up the old haproxies.

Mentioned in SAL (#wikimedia-cloud) [2021-05-07T17:00:36Z] <bstorm> deleted "toolsbeta-test-k8s-haproxy-2", "toolsbeta-test-k8s-haproxy-1" when the dns caches finally dropped T282227

Mentioned in SAL (#wikimedia-cloud) [2021-05-07T17:12:27Z] <bstorm> created A record of k8s.svc.tools.eqiad1.wikimedia.cloud pointing at current cluster with TTL of 300 for quick initial failover when the new set of haproxy nodes are ready T282227

Mentioned in SAL (#wikimedia-cloud) [2021-05-07T17:15:40Z] <bstorm> recreated recordset of k8s.tools.eqiad1.wikimedia.cloud as CNAME to k8s.svc.tools.eqiad1.wikimedia.cloud T282227

Bstorm claimed this task.

This works and should be good to go. I set the TTL on the k8s.svc.tools.eqiad1.wikimedia.cloud record to 300 because I know you are going to change it to a new cluster soon. Feel free to change that back to 3600 (the default) after you've rebuilt the haproxy nodes. @Majavah
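For reference, restoring the default TTL later could look like this (dry run; `openstack recordset set` is the Designate CLI's update command, though depending on client version the recordset may need to be referenced by ID rather than name, and the record is assumed to live in the svc. subzone):

```shell
# Dry-run sketch: bump the TTL back to the 3600 default once the new
# haproxy nodes are in place and fast failover no longer matters.
zone="tools.eqiad1.wikimedia.cloud."
echo openstack recordset set --ttl 3600 "svc.$zone" "k8s.svc.$zone"
```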