The defined scope of the 'wikikube' cluster explicitly excludes monitoring tools such as Grafana and Kibana.
This means that we need somewhere else to run the centralized parts of Jaeger (writing into storage, and the query + UI elements for retrieving trace data).
We discussed and rejected the idea of running these in a custom Docker / docker-compose setup on either a VM or a bare-metal host. We decided it would be about as much work, and much more reusable, to simply turn up a small, new k8s cluster, which we decided to call 'aux'. (Another option considered was "ancillary", which was thought more expressive but much longer to type.)
The scope for this cluster, at least to start with, would be confined to just observability tools and other SRE-supported critical infrastructure services (for example, Netbox would be considered in-scope). We can broaden this later, but the intent is to avoid the cluster becoming a 'junk drawer'.
With aux as the cluster name prefix, we also decided upon aux-k8s-etcd, aux-k8s-ctrl, and aux-k8s-worker as the machine name prefixes, similar to what Data Engineering did with dse-k8s-.
The initial plan is to run all of this on Ganeti, in just one of the core clusters, and to start with just a couple of worker nodes.
So, on eqiad Ganeti, we need to turn up:
- 3x aux-k8s-etcd nodes, 1G RAM, 1 vCPU each
- 2x aux-k8s-ctrl nodes, 4G RAM, 1 vCPU each
- 2x aux-k8s-worker nodes, 16G RAM, 8 vCPU each
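
To get a feel for the overall Ganeti footprint of the list above, the per-group sizes can be summed up with a small script. This is only an illustrative sketch: the `NodeGroup` type, the `AUX_EQIAD` list, and the `totals` helper are hypothetical names made up here, not part of any existing tooling, and the RAM figures are taken exactly as written in the list (in "G").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One line of the node list: a machine name prefix and its per-VM size."""
    prefix: str
    count: int
    ram_g: int   # RAM per VM, in "G" as written in the plan
    vcpus: int   # vCPUs per VM


# Hypothetical encoding of the planned eqiad VMs, copied from the list above.
AUX_EQIAD = [
    NodeGroup("aux-k8s-etcd", count=3, ram_g=1, vcpus=1),
    NodeGroup("aux-k8s-ctrl", count=2, ram_g=4, vcpus=1),
    NodeGroup("aux-k8s-worker", count=2, ram_g=16, vcpus=8),
]


def totals(groups):
    """Return (number of VMs, total RAM in G, total vCPUs) across all groups."""
    vms = sum(g.count for g in groups)
    ram = sum(g.count * g.ram_g for g in groups)
    vcpus = sum(g.count * g.vcpus for g in groups)
    return vms, ram, vcpus


if __name__ == "__main__":
    vms, ram, vcpus = totals(AUX_EQIAD)
    print(f"{vms} VMs, {ram}G RAM, {vcpus} vCPUs")
```

Under these numbers the initial request comes to 7 VMs, 43G of RAM, and 21 vCPUs, with the two workers accounting for most of it.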