
Define constraints for cloudelastic use cases
Open, Medium, Public


Now that cloudelastic has elasticsearch installed (T214921), we need to define how this cluster should be used. This will influence how we set up access to it, how we import data, and the other steps that need to happen before it is fully usable. The list below is non-exhaustive and is meant as the start of a discussion, not a definitive answer.

Update lag:
Updates are asynchronous. In the best case, changes are searchable after a few minutes, but lag of up to a few hours is perfectly normal, in particular during regular maintenance. We need to ensure that no workflow based on cloudelastic relies on low update lag. One way to do that is to introduce artificial lag and keep the lag higher than a few hours at all times. An extreme version of this would be to refresh cloudelastic only weekly.
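
To make the artificial lag idea concrete, here is a minimal sketch of a queue that holds updates back until they are at least a minimum age. The class name, the 6 hour value and the assumption that updates arrive in chronological order are all illustrative; nothing here describes how the actual update pipeline works.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

# Hypothetical policy value: the task only says "higher than a few hours".
MIN_LAG = timedelta(hours=6)


class DelayedUpdateQueue:
    """Holds updates until they are at least min_lag old.

    Assumes updates are pushed in chronological order.
    """

    def __init__(self, min_lag=MIN_LAG):
        self.min_lag = min_lag
        self._pending = deque()  # (event_time, update), oldest first

    def push(self, event_time, update):
        self._pending.append((event_time, update))

    def pop_ready(self, now):
        """Return the updates whose age has reached min_lag, oldest first."""
        ready = []
        while self._pending and now - self._pending[0][0] >= self.min_lag:
            ready.append(self._pending.popleft()[1])
        return ready
```

A writer process would call `pop_ready(datetime.now(timezone.utc))` periodically and send only the returned updates to cloudelastic, so clients never observe lag below `min_lag`.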

Data structure changes:
Over time, we will refine the way we index documents into elasticsearch. This can include adding or removing fields, changing analyzer configuration, or a number of other changes. Tools using cloudelastic should be aware that there will be breaking changes. We might want to define and document a subset of fields that we consider stable.
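
If a stable subset of fields is documented, tools could guard against depending on anything else. The sketch below builds a search request body that only asks for stable fields via `_source` filtering. The field names in `STABLE_FIELDS` are placeholders, not a proposal for the actual stable set, and the query itself is just an example.

```python
# Hypothetical stable subset; the real list would have to be defined and documented.
STABLE_FIELDS = frozenset({"title", "namespace", "timestamp"})


def build_search_body(query_text, fields):
    """Build a search body that returns only fields from the stable subset."""
    unstable = sorted(set(fields) - STABLE_FIELDS)
    if unstable:
        raise ValueError(f"fields not in the stable subset: {unstable}")
    return {
        "query": {"match": {"title": query_text}},
        # Source filtering: only the listed fields are returned for each hit.
        "_source": sorted(fields),
    }
```

Failing early on an unstable field makes the dependency visible in the tool itself, rather than surfacing later as a silent breakage after a mapping change.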

Elasticsearch upgrades:
Elastic is known for breaking backward compatibility fairly often. Tools will break in unexpected ways during upgrades. We will need to communicate the upgrade schedule ahead of time. We will probably not be able to provide an environment to test version N+1, so tool owners will bear the burden of testing the compatibility of their tools. We need a way to forward deprecation warnings from the elasticsearch logs to each tool owner.
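
Besides the server-side deprecation log, elasticsearch also reports deprecated usage to the caller in the HTTP `Warning` response header, so a proxy or the tool itself could collect those warnings per request. The sketch below extracts the messages from such a header value. It assumes the usual `299 Elasticsearch-<version>-<hash> "<message>"` shape (some versions append a quoted date), and the function name is made up for illustration.

```python
import re

# Matches one warning value: code 299, an agent token, then the quoted message.
# A trailing quoted date, when present, is ignored.
_WARNING_RE = re.compile(r'299 \S+ "((?:[^"\\]|\\.)*)"')


def deprecation_messages(warning_header):
    """Return the deprecation messages found in a Warning header value."""
    return _WARNING_RE.findall(warning_header or "")
```

An authenticating proxy that already knows which tool made the request could log these messages per tool and forward them to the owner.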

Multi cluster:
We currently have 3 elasticsearch clusters, and each wiki is mapped to one of those clusters. This mapping should not be considered stable. We need to provide a way for clients to be routed to the correct cluster, or to discover this mapping in an automated way. We should probably restrict cross-index searches, since they would only work inside a single cluster.
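
As a sketch of what automated discovery could look like on the client side, the functions below look up the cluster for a wiki and refuse a cross-index search that would span clusters. The mapping contents and cluster names are invented for the example; in practice the mapping would be fetched from wherever it ends up being published, not hard-coded.

```python
# Hypothetical mapping; real clients should load it at runtime since it is not stable.
WIKI_TO_CLUSTER = {
    "enwiki": "cluster-a",
    "dewiki": "cluster-a",
    "wikidatawiki": "cluster-b",
    "commonswiki": "cluster-c",
}


def cluster_for(wiki, mapping=WIKI_TO_CLUSTER):
    """Return the cluster that hosts the given wiki."""
    try:
        return mapping[wiki]
    except KeyError:
        raise LookupError(f"no cluster known for wiki {wiki!r}") from None


def single_cluster_for(wikis, mapping=WIKI_TO_CLUSTER):
    """Return the one cluster hosting all given wikis, or raise if they span several."""
    clusters = {cluster_for(wiki, mapping) for wiki in wikis}
    if len(clusters) != 1:
        raise ValueError(f"search must target exactly one cluster, got {sorted(clusters)}")
    return clusters.pop()
```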

Quota / rate limiting:
At some point, we will probably need to introduce some form of per-user quota or rate limiting. This will require some form of authentication. We should start early on authentication (T220069: Build authenticating reverse proxy for Cloud CirrusSearch replicas); the actual rate limiting can come later.
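
One common way to do per-user rate limiting once requests are authenticated is a token bucket per user. The sketch below is only an illustration of that technique, with made-up class names and no claim about what the proxy in T220069 will implement. Each user may burst up to `capacity` requests and is then limited to `rate` requests per second.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Consume one token if available and report whether the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerUserLimiter:
    """Keeps one token bucket per authenticated user."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self._buckets = {}

    def allow(self, user, now=None):
        bucket = self._buckets.get(user)
        if bucket is None:
            bucket = self._buckets[user] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

The `now` parameter exists only to make the behaviour testable with fixed timestamps; in normal use it is left out and the monotonic clock is used.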

Read only:
We want updates to only come from production. This cluster should not be used as a generic elasticsearch cluster where anyone can index their own dataset.
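
If access goes through a reverse proxy, the read-only constraint could be enforced there with a method and endpoint allowlist. The sketch below is a deliberately conservative example: it allows `GET` and `HEAD`, and `POST` only when the last path segment is one of a few read endpoints. The function name and the exact allowlist are assumptions, and a real deployment would need to review which endpoints to expose.

```python
# POST is needed for search bodies, so allow it only for known read endpoints.
READ_ONLY_POST_ENDPOINTS = frozenset({"_search", "_count", "_msearch"})


def is_read_only_request(method, path):
    """Return True if the request is allowed on a read-only cluster."""
    method = method.upper()
    if method in ("GET", "HEAD"):
        return True
    if method == "POST":
        # Drop the query string and any trailing slash, then check the last segment.
        last_segment = path.split("?", 1)[0].rstrip("/").rsplit("/", 1)[-1]
        return last_segment in READ_ONLY_POST_ENDPOINTS
    return False
```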

Event Timeline

Gehel created this task. Apr 5 2019, 2:08 PM
Restricted Application added a subscriber: Aklapper. Apr 5 2019, 2:08 PM
TJones added a subscriber: TJones. Apr 10 2019, 2:06 PM

This looks good, @Gehel. You brought up some things we hadn't talked about before, so you covered more than 100% of the topics I had!

Krenair updated the task description. May 24 2019, 10:16 PM
Krenair added a subscriber: Krenair.
EBernhardson triaged this task as Medium priority. May 30 2019, 4:00 PM
EBernhardson moved this task from needs triage to making others happy on the Discovery-Search board.
debt added a subscriber: debt. May 30 2019, 4:02 PM

Moving this to a bit later, once we get more folks using the service.