As announced in https://wikitech.wikimedia.org/wiki/Cumin#Host_selection pretty much from the beginning, it's about time to add support for multiple queries to Cumin. I've already had a chat with @Joe about it, and this summary also includes his suggestions.
This will allow us to:
- mix results from different backends (like puppetdb and a future conftool/etcd backend)
- overcome backend limitations (like mixing fact and resource queries in the puppetdb backend)
- simplify the alias management
This of course has a cost for the user, who will now need to specify which backend should be queried for each block. There are basically two approaches:
1. Force the user to always specify which backend to use, even for the simplest query.
  - PROs: explicit, the user doesn't need to know or remember which is the default backend
  - CONs: not backward compatible, more verbose (see below for possible implementations)
2. Keep the default backend setting in the configuration file and try to parse the query with that backend first; if it doesn't parse, fall back to the general multi-query grammar.
  - PROs: backward compatible (currently working queries will continue to work), hence more usable
  - CONs: implicit, the user needs to know which is the default backend (this seems pretty obvious right now for us with puppetdb, but the assumption might not hold in general)
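To make the fallback in option (2) concrete, here is a minimal sketch of the control flow only. Everything in it is hypothetical: `ParseError`, `parse_default_backend` and `parse_multi_query` are placeholder names, not existing Cumin functions, and the "grammar" of the default backend is faked by rejecting any query that contains curly braces.

```python
class ParseError(Exception):
    """Hypothetical error raised when a query does not match a grammar."""


def parse_default_backend(query):
    """Stand-in for the default backend's grammar (assumption: it has no curly braces)."""
    if "{" in query or "}" in query:
        raise ParseError("not a valid query for the default backend")
    return ("default", query)


def parse_multi_query(query):
    """Stand-in for the general multi-query grammar."""
    return ("multi", query)


def parse(query):
    """Try the default backend first, then fall back to the multi-query grammar."""
    try:
        return parse_default_backend(query)
    except ParseError:
        return parse_multi_query(query)


print(parse("R:Class = Foo::Bar"))
print(parse("P{R:Class = Foo::Bar} or D{host1}"))
```

With this shape, existing queries keep going through the default backend untouched, and only queries it rejects reach the new grammar.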
My first choice would be (1), as it is more generic and clean, assuming that alias support will significantly reduce the need to write full queries each time. But (2) has the big advantage of being backward compatible and less verbose, so it is probably the better choice.
Given how young the tool is, I would just like to avoid making choices now that we might regret later. I'd rather make a drastic change now than later on.
Regarding the grammar syntax itself, my proposal is to use a letter to identify the backend: P for puppetdb, D for direct (it could be more than one character, but I would like to keep it short), and curly braces to enclose the subqueries, because parentheses and square brackets are already used by the existing grammars.
Here are some examples:
  # Full complex query with all the grammar features
  (P{R:Class = Foo::Bar and R:Class%param = value} and P{F:has_ipmi = true and F:is_virtual = false} and not A:hosts_down) or D{special_host.wikimedia.org}

  # Simple all hosts in puppetdb, if we go for option (1)
  P{*}
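As a rough illustration of how the proposed syntax could be split into per-backend blocks, here is a small tokenizer sketch using only the standard library. It is not an implementation plan: the `BACKENDS` mapping, the `tokenize` function and the token shapes are hypothetical, and it assumes that subqueries never contain curly braces and that `A:name` at the top level refers to an alias.

```python
import re

# Hypothetical mapping of the proposed one-letter prefixes to backend names.
BACKENDS = {"P": "puppetdb", "D": "direct"}

# One token per match: a backend block, an alias, a boolean operator or a parenthesis.
TOKEN_RE = re.compile(
    r"\s*(?:"
    r"(?P<backend>[A-Z])\{(?P<body>[^{}]*)\}"  # e.g. P{...}; body has no braces (assumption)
    r"|A:(?P<alias>[\w.-]+)"                   # e.g. A:hosts_down (assumed alias syntax)
    r"|(?P<op>and|or|not)\b"
    r"|(?P<paren>[()])"
    r")"
)


def tokenize(query):
    """Split a multi-query string into a flat list of tokens."""
    query = query.strip()
    tokens = []
    pos = 0
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise ValueError("unexpected input at position {}".format(pos))
        if match.group("backend") is not None:
            letter = match.group("backend")
            if letter not in BACKENDS:
                raise ValueError("unknown backend prefix: {}".format(letter))
            # The body is left untouched: the named backend parses it with its own grammar.
            tokens.append(("subquery", BACKENDS[letter], match.group("body").strip()))
        elif match.group("alias") is not None:
            tokens.append(("alias", match.group("alias")))
        elif match.group("op") is not None:
            tokens.append(("op", match.group("op")))
        else:
            tokens.append(("paren", match.group("paren")))
        pos = match.end()
    return tokens


for token in tokenize("(P{F:is_virtual = false} and not A:hosts_down) or D{host1}"):
    print(token)
```

The point of the sketch is that the outer grammar only needs to understand prefixes, braces, aliases and boolean operators, while each subquery body stays opaque until it is handed to its backend.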
Any comments and feedback are appreciated.