Now that cloudelastic has elasticsearch installed (T214921), we need to define how this cluster should be used, which will influence how we set up access to it, how we import data, and other steps that need to happen before it is fully usable. The list below is non-exhaustive and is meant as the start of a discussion, not a definitive answer.
Updates are asynchronous. In the best case, changes are searchable after a few minutes, but lag of up to a few hours is perfectly normal, in particular during regular maintenance. We need to ensure that no workflow based on cloudelastic relies on low update lag. One way to do that is to introduce artificial lag and keep the lag higher than a few hours at all times. An extreme version of this could be to refresh cloudelastic only weekly.
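To make the artificial lag idea concrete, here is a minimal sketch of a buffer that holds updates until they are at least a configured age. The class name, the 6 hour default, and the idea of buffering in the update pipeline are all assumptions made for illustration, not a description of how updates are actually shipped to cloudelastic.

```python
import heapq
from typing import Any, List, Tuple

# Assumption: "higher than a few hours" is modelled here as 6 hours.
MIN_LAG_SECONDS = 6 * 3600


class DelayedUpdateQueue:
    """Hypothetical buffer that releases updates only once they are old enough."""

    def __init__(self, min_lag: float = MIN_LAG_SECONDS) -> None:
        self.min_lag = min_lag
        self._heap: List[Tuple[float, int, Any]] = []
        self._counter = 0  # tie-breaker so update payloads are never compared

    def push(self, update: Any, event_time: float) -> None:
        """Buffer an update together with the time the change happened."""
        heapq.heappush(self._heap, (event_time, self._counter, update))
        self._counter += 1

    def pop_ready(self, now: float) -> List[Any]:
        """Return, oldest first, every update that is at least min_lag old."""
        ready = []
        while self._heap and now - self._heap[0][0] >= self.min_lag:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

A writer process would call `pop_ready(time.time())` periodically and index only what it returns, so no tool could ever observe lag below `min_lag`.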
Data structure changes:
Over time, we will refine the way we index documents into elasticsearch. This can include adding or removing fields, changing analyzer configuration, or a number of other changes. Tools using cloudelastic should be aware that there will be breaking changes. We might want to define and document a subset of fields that we consider stable.
Elastic is known for breaking backward compatibility fairly often. Tools will break in unexpected ways during upgrades. We will need to communicate the upgrade schedule ahead of time. We will probably not be able to provide an environment to test version N+1, so tool owners will bear the burden of testing the compatibility of their tools. We need a way to forward deprecation warnings from the elasticsearch logs to each tool owner.
We currently have 3 elasticsearch clusters, and each wiki is mapped to one of those clusters. This mapping should not be considered stable. We need to provide a way for clients to be routed to the correct cluster, or to discover this mapping in an automated way. We should probably restrict cross-index searches, since they would only work inside a single cluster.
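As a rough illustration of what client-side routing could look like, the sketch below resolves a wiki to its cluster endpoint and rejects searches that would span clusters. The cluster names, endpoint URLs, and the wiki assignments are entirely hypothetical placeholders; in practice the mapping would have to be fetched from whatever discovery mechanism we end up providing, since it is not stable.

```python
from typing import Dict, Iterable

# Hypothetical data: none of these names or URLs reflect the real setup.
WIKI_TO_CLUSTER: Dict[str, str] = {
    "enwiki": "cluster-a",
    "frwiki": "cluster-b",
    "dewiki": "cluster-b",
}
CLUSTER_ENDPOINTS: Dict[str, str] = {
    "cluster-a": "https://cluster-a.example.org",
    "cluster-b": "https://cluster-b.example.org",
}


def endpoint_for(wiki: str) -> str:
    """Return the endpoint of the cluster that currently hosts this wiki."""
    try:
        return CLUSTER_ENDPOINTS[WIKI_TO_CLUSTER[wiki]]
    except KeyError:
        raise LookupError(f"no cluster mapping known for {wiki!r}") from None


def endpoint_for_search(wikis: Iterable[str]) -> str:
    """Return the single endpoint for a multi-wiki search, or refuse it."""
    endpoints = {endpoint_for(wiki) for wiki in wikis}
    if len(endpoints) != 1:
        raise ValueError("search must target wikis hosted on exactly one cluster")
    return endpoints.pop()
```

Because the mapping can change, a real client would refresh it regularly rather than hard-code it as done here.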
Quota / rate limiting:
At some point, we will probably need to introduce some form of per-user quota or rate limiting. This will require some form of authentication. We should start early on authentication (T220069: Build authenticating reverse proxy for Cloud CirrusSearch replicas); the actual rate limiting can come later.
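One common way to implement per-user rate limiting is a token bucket keyed by the authenticated user. The sketch below is only an illustration of that technique, assuming the reverse proxy hands us a user name; the class names, rates, and capacities are hypothetical and nothing here has been decided.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Bucket that refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self.updated = now

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerUserLimiter:
    """Hypothetical limiter keeping one independent bucket per user."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, user: str) -> bool:
        """Return True if this user's request fits within their quota."""
        now = self.clock()
        bucket = self._buckets.get(user)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self._buckets[user] = bucket
        return bucket.allow(now)
```

The clock is injectable so the behaviour can be checked deterministically; a proxy would answer with an error such as HTTP 429 whenever `allow()` returns False.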
We want updates to only come from production. This cluster should not be used as a generic elasticsearch cluster where anyone can index their own dataset.