We have a few hostnames in hieradata/common/profile/trafficserver/backend.yaml that should be moved to discovery records for easier operations (e.g. reimage/flip/etc).
The ultimate goal is to simplify operations for each service compared to the status quo.
Services:
- Grafana
- Logstash
- Pyrra
- Prometheus
- Thanos
Since different services require different strategies, the following sections outline the trade-offs and solutions on a per-service basis.
grafana
This is the trickiest of all, I think. Ideally I (Filippo) would like a single patch or command to flip the active/standby grafana host. Note that whatever host grafana.w.o points to should also be reflected in profile::grafana::active_host (and profile::grafana::standby_host) so that the "singleton" units (such as syncing ldap users) follow.
A potential solution could look like T357384
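The consistency requirement for grafana (the host serving grafana.w.o must match profile::grafana::active_host) could be checked mechanically before or after a flip. Below is a minimal sketch in Python, assuming the hiera values are already loaded into a dict and the current target of the record is passed in as a string. The function name and the example hostnames are hypothetical and not taken from the puppet repo; how the record is actually resolved is out of scope here.

```python
def check_grafana_flip(hiera, dns_target):
    """Return a list of problems; an empty list means the state is consistent.

    hiera: dict holding the two hiera keys named in this task.
    dns_target: the host the grafana record currently points to
                (resolving it is out of scope for this sketch).
    """
    active = hiera.get("profile::grafana::active_host")
    standby = hiera.get("profile::grafana::standby_host")
    problems = []
    if not active or not standby:
        problems.append("active_host and standby_host must both be set")
    elif active == standby:
        problems.append("active_host and standby_host must differ")
    if active and dns_target != active:
        problems.append(
            f"DNS points to {dns_target} but active_host is {active}"
        )
    return problems


# Hypothetical hostnames, for illustration only.
example = {
    "profile::grafana::active_host": "grafana-a.example.wmnet",
    "profile::grafana::standby_host": "grafana-b.example.wmnet",
}
print(check_grafana_flip(example, "grafana-a.example.wmnet"))
print(check_grafana_flip(example, "grafana-b.example.wmnet"))
```

A check like this would only catch drift between the record and hiera; it does not perform the flip itself, which is what T357384 is about.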
logstash [done]
This ties in with moving the read path for logs. Moving to a confctl-controlled discovery.wmnet record would make flipping datacenters for logstash quicker and bring it in line with other services. What do you think @colewhite? - SGTM!
prometheus [done]
We need to point to individual hosts because we're using mod_auth_cas. Once we move to oauth2-proxy for SSO authentication, we can replace those with prometheus.svc.SITE.wmnet. This is basically T326657
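To illustrate the prometheus.svc.SITE.wmnet pattern, here is a small Python sketch that expands the SITE placeholder for a given site. The helper name is hypothetical, and the site names used below are example values only; which sites would actually be listed depends on the backend configuration.

```python
def svc_record(service, site):
    """Build a per-site service record name: <service>.svc.<site>.wmnet."""
    if not site or not site.isalnum() or not site.islower():
        raise ValueError(f"unexpected site name: {site!r}")
    return f"{service}.svc.{site}.wmnet"


# Example site names, for illustration only.
for site in ("eqiad", "codfw"):
    print(svc_record("prometheus", site))
```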
pyrra (includes slo/slos) [done]
- point to thanos-web.discovery.wmnet