We should set up an experimental Elasticsearch cluster on Tool Labs, write-by-whitelist to begin with and readable by everyone, since several tools would benefit from such a thing (originally suggested by @Halfak and @Capt_Swing for a Teahouse search project). The cluster that stashbot uses could then simply be part of Labs. This will also give us experience to build on for future logging experiments.
This should be easy, but don't forget that all the horrible things one can do to DoS or crash an Elasticsearch node are query-based. Will we want to expose Elasticsearch outside of the tools project?
Put an authenticating proxy in front.
What will we auth against? Labs LDAP?
I'm assuming that at least the end goal here will be to isolate write access by index (e.g. stashbot can write to stashbot-* but not teahouse-*). Do we need to care about that at all at first?
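To make the per-index isolation concrete, here is a minimal sketch of a write whitelist keyed by index pattern. The `WRITE_ACL` table, the writer names used as keys, and the `may_write` helper are all hypothetical; nothing in this thread fixes how writers will be identified or where the whitelist will live.

```python
from fnmatch import fnmatchcase

# Hypothetical whitelist: writer name -> index patterns it may write to.
# The names and patterns only mirror the stashbot/teahouse example above.
WRITE_ACL = {
    "stashbot": ["stashbot-*"],
    "teahouse": ["teahouse-*"],
}


def may_write(user, index, acl=WRITE_ACL):
    """Return True if `user` is whitelisted to write to `index`."""
    # Unknown writers get an empty pattern list, so they are denied.
    return any(fnmatchcase(index, pattern) for pattern in acl.get(user, []))
```

A proxy could call something like `may_write("stashbot", "stashbot-2015.09")` before forwarding a write, and reject the request when it returns False.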
Right, so by default let's not expose this outside the tools project, and make up DOS protections as we go along :D
Ah, so the other things that authenticate do so against... identd. I would say we can even get away with something like basic auth to begin with and then decide what to do afterwards.
- 3 XL jessie instances in the tools project
- new ::role::toollabs::elasticsearch Puppet role to provision:
- reverse proxy that restricts writing to Elasticsearch instances
The initial reverse proxy will probably be nginx with basic auth protection for POST requests. That will let us get started with other bits. Eventually this would be replaced with a custom proxy that allows more fine-grained restrictions.
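The rule the initial proxy would enforce can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the planned nginx configuration; the `USERS` table, its placeholder password, and the `is_authorized` helper are assumptions made for the example.

```python
import base64
import hmac

# Hypothetical credential store; a real setup would use an htpasswd file or similar.
USERS = {"stashbot": "example-password"}


def is_authorized(method, authorization_header, users=USERS):
    """Let non-POST requests through; require valid Basic credentials for POST."""
    if method.upper() != "POST":
        return True
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        encoded = authorization_header[len("Basic "):]
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    user, sep, password = decoded.partition(":")
    expected = users.get(user)
    if not sep or expected is None:
        return False
    # Constant-time comparison to avoid leaking password contents via timing.
    return hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8"))
```

Gating only POST mirrors the plan as stated. Elasticsearch also accepts writes via PUT and DELETE, so the method check would likely need to cover those as well, while some read operations such as searches with a request body can arrive as POST.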