We need to create an on-disk structure, kept under revision control, that gets synced to the etcd cluster. This structure will contain the "static" config, while the dynamic "state" of each individual key will be determined at runtime by changing it directly in etcd.
Terminology used in the remainder of this document:
- cluster: The value of the $cluster variable in puppet for the given node (either explicitly set or fetched from hiera)
- datacenter: The value of the $::site variable in puppet for the given node
- pool: the ensemble of nodes that are the backends of e.g. a pybal virtual IP
- service: an individual service running on a specific node
So a single "pool" is identified by the (datacenter, cluster, service) tuple. For example, (eqiad, cache_text, varnish-frontend) identifies the list of hosts that a specific pybal service (text-lb.svc.eqiad.wikimedia.org:80, IIRC) uses as backends.
Proposal
On disk we'd probably like a structure organized by node rather than by service, so that the information stays as normalized as possible; on etcd, we'd like the data aggregated by service, so that it is easy to query and clients have less parsing to do.
So one possible structure, maintained as one yaml file per datacenter (e.g. eqiad/pools.yaml), would be:
cache_text:
  cp1052:
    services: ['nginx', 'varnish-fe', 'varnish-be']
  cp1053:
    services: [...]
  ...
appservers:
  mw1018:
    services: ['apache']
  ...
and have an additional file for describing individual services (this time a single file, services.yaml):
cache_text:
  varnish-fe:
    port: 80
    default_values: { pooled: no, weight: 10 }
  ...
appservers:
  apache:
    user: foo
    default_values: { pooled: yes, weight: 1 }
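Since pools.yaml references services by name, the sync tool will presumably want to validate that every service a node lists is actually defined in services.yaml before touching etcd. A minimal sketch of such a check (the inline dicts stand in for the parsed yaml files; `undefined_services` is a hypothetical helper, not part of conftool):

```python
# In-memory stand-ins for a slice of eqiad/pools.yaml and services.yaml.
pools = {"cache_text": {"cp1052": {"services": ["varnish-fe", "varnish-be"]}}}
services = {"cache_text": {"varnish-fe": {"port": 80}, "varnish-be": {}}}

def undefined_services(pools, services):
    """Return (cluster, node, service) triples that pools.yaml references
    but services.yaml never defines."""
    return [
        (cluster, node, svc)
        for cluster, nodes in pools.items()
        for node, conf in nodes.items()
        for svc in conf["services"]
        if svc not in services.get(cluster, {})
    ]

# An empty result means the two files are consistent.
print(undefined_services(pools, services))
```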
A dedicated sync script will then replicate this into a more convenient, denormalized structure on the distributed k/v store:
/pools/datacenter/cluster/service/node
/pools/eqiad/cache_text/varnish-fe/cp1052 => {pooled: yes, weight: 10000}
/pools/eqiad/cache_text/varnish-fe/cp1053 => {pooled: yes, weight: 10000}
...
/pools/eqiad/cache_text/varnish-be/cp1052 => {pooled: yes, weight: 10000}
/pools/eqiad/cache_text/varnish-be/cp1053 => {pooled: yes, weight: 10000}
...
/pools/eqiad/appservers/apache/mw1018 => {pooled: no, weight: 10000}
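A sketch of how such a sync script might denormalize the two files, seeding each /pools/datacenter/cluster/service/node key with the service's default_values. The dicts stand in for the parsed yaml files and `denormalize` is a hypothetical helper; the real tool would read the files and write the pairs to etcd instead of printing them:

```python
import json

# In-memory stand-ins for eqiad/pools.yaml and services.yaml
# (yaml's bare no/yes parse as the booleans False/True).
pools = {
    "cache_text": {"cp1052": {"services": ["varnish-fe", "varnish-be"]}},
    "appservers": {"mw1018": {"services": ["apache"]}},
}
services = {
    "cache_text": {
        "varnish-fe": {"port": 80, "default_values": {"pooled": False, "weight": 10}},
        "varnish-be": {"default_values": {"pooled": False, "weight": 10}},
    },
    "appservers": {
        "apache": {"default_values": {"pooled": True, "weight": 1}},
    },
}

def denormalize(datacenter, pools, services):
    """Yield (etcd_key, json_value) pairs laid out as
    /pools/<datacenter>/<cluster>/<service>/<node>."""
    for cluster, nodes in pools.items():
        for node, conf in nodes.items():
            for service in conf["services"]:
                defaults = services[cluster][service]["default_values"]
                key = f"/pools/{datacenter}/{cluster}/{service}/{node}"
                yield key, json.dumps(defaults)

for key, value in denormalize("eqiad", pools, services):
    print(key, "=>", value)
```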
A command-line tool will be provided to easily change the state of a resource. Its syntax will be something like
$ conftool --datacenter eqiad --cluster cache_text --service varnish-fe "cp1052 pool=no; cp1053 pool=yes:weight=20000"
Node cp1052 depooled from service varnish-fe
Node cp1053 pooled in service varnish-fe with weight 20000
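Reading that syntax as "semicolon-separated per-node stanzas, each a node name followed by colon-separated key=value changes", the action string could be parsed with something like the sketch below (`parse_actions` is a hypothetical helper; the real conftool grammar may differ):

```python
def parse_actions(spec):
    """Parse e.g. 'cp1052 pool=no; cp1053 pool=yes:weight=20000'
    into a {node: {key: value}} mapping of requested changes."""
    actions = {}
    for stanza in spec.split(";"):
        node, _, changes = stanza.strip().partition(" ")
        actions[node] = dict(c.split("=", 1) for c in changes.split(":"))
    return actions

print(parse_actions("cp1052 pool=no; cp1053 pool=yes:weight=20000"))
```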
Note that referring to such a structure in puppet will be very easy, as (apart from the "service" label) everything has a 1:1 correspondence in puppet.
Any suggestions are welcome; I am building the basic blocks for both tools right now, so some decisions would be useful in the near future.
Also note that any structure we choose now can be prefixed with a version number, like /v1/pools so that any future migration will be easier. (I am leaving the "/v1/pools vs /pools_v1" yakshaving session to another ticket)