Current state: One LVS + dnsdisc for a service on Kubernetes
Each service running on Kubernetes reserves a TCP port (a Kubernetes NodePort) on which every Kubernetes node listens. We then configure a service::catalog entry, including LVS and dnsdisc, for each of those services.
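To illustrate, such an entry looks roughly like this (a minimal sketch following the general service::catalog shape; service name, port and VIPs are made-up placeholders):

```
# Minimal sketch of a service::catalog entry with a dedicated LVS + dnsdisc
# setup. Key names follow the general catalog shape; all values are made up.
myservice:
  description: Example service running on Kubernetes
  encryption: true
  port: 4001                      # the NodePort reserved on every k8s node
  sites:
    - eqiad
    - codfw
  state: production
  ip:
    eqiad: {default: 10.2.2.42}   # dedicated LVS VIP per site
    codfw: {default: 10.2.1.42}
  lvs:
    enabled: true
    class: low-traffic
    conftool:
      cluster: kubernetes
      service: myservice
  discovery:                      # creates myservice.discovery.wmnet
    - dnsdisc: myservice
      active_active: true
```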
This gives us all the benefits of being able to pool/depool services quickly and independently, as well as central configuration for monitoring, while using techniques that are well known to other SREs.
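Pooling/depooling then happens per service through the usual conftool discovery objects, along these lines (miscweb/eqiad used as placeholder values):

```
confctl --object-type discovery select 'dnsdisc=miscweb,name=eqiad' set/pooled=false
```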
On the downside, adding a dedicated LVS involves quite a bit of manual labor, which we would like to reduce, especially for low-traffic services and services not yet in full production.
New state: One "meta" LVS (Kubernetes Ingress) + dedicated dnsdisc for a service on Kubernetes
On each Kubernetes node (well, on most of them) we now run Envoy, which binds to tcp/30443 and has a dedicated LVS and dnsdisc setup, k8s-ingress-wikikube (like all the other services described above). This Envoy instance does TLS termination and can fan out to multiple workload/real services (running on Kubernetes) without them needing a dedicated LVS setup.
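How individual workload services are routed behind that Envoy is then purely a Kubernetes concern. As a generic illustration only (the resource types actually in use may differ; this is just the upstream Ingress API with hypothetical names), such a fan-out rule could look like:

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: miscweb
  namespace: miscweb
spec:
  rules:
    - host: miscweb.discovery.wmnet   # Envoy routes on the requested host name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: miscweb         # the in-cluster Service for the workload
                port:
                  number: 8080
```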
We would like to keep adding those services to the service::catalog and add dnsdisc for them using the same LVS VIPs as k8s-ingress-wikikube. That way we are still able to pool/depool them individually and benefit from the standard monitoring setups.
Unfortunately, that creates a relationship between the dnsdisc records of the services and the dnsdisc record of k8s-ingress-wikikube: services may still be pooled/depooled individually, but depooling k8s-ingress-wikikube in one DC means all services depending on it have to be depooled there as well.
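To make the coupling concrete (same hypothetical confctl invocation as above):

```
# Depool the ingress itself in eqiad...
confctl --object-type discovery select 'dnsdisc=k8s-ingress-wikikube,name=eqiad' set/pooled=false
# ...but miscweb.discovery.wmnet (and every other service behind the ingress)
# keeps its own pooled state and would still hand out the eqiad VIP unless
# each of those records is depooled as well.
```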
The guinea pig for this is miscweb, where I've already removed the LVS stanza from service::catalog in:
- https://gerrit.wikimedia.org/r/c/operations/puppet/+/770504/8
- https://gerrit.wikimedia.org/r/c/operations/puppet/+/775319
The second patch was needed because I obviously messed up the port in the first patch, but also because monitor.pp does not create Icinga host config for services without an lvs: stanza in service::catalog. The latter issue is going away as the new probes structure is now paging (T291946) and monitoring: is slowly being phased out, so I guess it can be ignored.
The open question now is how to account for the relationship between dnsdisc records.
Possible easy way forward
After another chat with @Joe today, we agreed on what we think is the less work-intensive way forward for now: accepting the loss of the ability to "easily" depool services under Ingress (via conftool), but sticking with an individual service::catalog entry as well as a CNAME (pointing to k8s-ingress-wikikube.discovery.wmnet in the default state).
We have two options there:
- We could use CNAMEs in the usual form (like SERVICE.discovery.wmnet), which has potential benefits as existing tooling is/might be tailored towards that. The downside is that those CNAMEs might easily be confused with dnsdisc records, leading people to look in the wrong direction.
- Use a different DNS domain (we already have SERVICE.k8s-staging.discovery.wmnet) for those CNAMEs, to avoid confusing them with "real" dnsdisc records. This could be more work, as existing tooling might need to be adapted, and we would need to refactor names again should we later decide to implement some relationship between dnsdisc records.
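In zone-file terms, the two options would look roughly like this (miscweb as the example; the k8s subdomain in the second record is a hypothetical production analogue of k8s-staging):

```
; Option 1: usual form, easy to mistake for a real dnsdisc record
miscweb      300  IN  CNAME  k8s-ingress-wikikube.discovery.wmnet.

; Option 2: dedicated subdomain, clearly not a dnsdisc record
miscweb.k8s  300  IN  CNAME  k8s-ingress-wikikube.discovery.wmnet.
```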
Further things to consider regarding service::catalog entries for services under Ingress (a sketch of such an entry follows this list):
- The monitoring: stanza can't be added, as having it without lvs: breaks Icinga. This can potentially be ignored (T291946), see above.
- It's currently not clear (to me) whether the absence of the discovery: stanza has any implications besides not having dnsdisc.
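For completeness, the sketch mentioned above: a service::catalog entry for a service behind the ingress, i.e. without lvs:, discovery: and monitoring: stanzas (field names approximate, all values made up; probes: assumes the new structure from T291946):

```
# Sketch of a service::catalog entry for a service behind the ingress.
# No lvs:, discovery: or monitoring: stanza; values are hypothetical.
miscweb:
  description: Static microsites, reached via k8s-ingress-wikikube
  encryption: true
  port: 30443                     # the shared ingress port, no own NodePort
  sites:
    - eqiad
    - codfw
  state: production
  ip:
    eqiad: {default: 10.2.2.66}   # the shared k8s-ingress-wikikube VIPs
    codfw: {default: 10.2.1.66}
  probes:                         # new probes structure (T291946)
    - type: http
```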