We are running into limits of the elasticsearch architecture, basically we are "full" on indices and can't really create more. Our systems are already over the baselines, with us having to adjust the default master timeout from 5s up to 30s to ensure the daily creation of completion suggesters doesn't fail. Evaluation of adding more indices to the cluster in T192972 showed the cluster having problems placing indices around the cluster even if they were empty.
High level solution:
- Run two jvm's per node in separate clusters
- One large jvm for wikis with shards > 100M
- One small jvm for the remaining wikis
- The small jvm's to be split into two clusters of ~17 nodes each.
- We can almost certainly shrink the large jvm's from their current 30G to some smaller number.
- Estimating small jvm's at 6g, if we can shave a couple g from the large jvm's there should be very little impact on disk cache availability
Looking at our data sizes, roughly 600 primary shards would go to the large jvm's and 2100 primary shards would be split between the two small clusters for 1000 primary shards each. Those 2100 shards represent only 32G of data, or about 100G with replicas, or mean of 3G per server. This is small enough that we shouldn't need any special considerations around data usage between the different elasticsearch instances.
This gets our cluster sizes back into manageable ranges and re-opens the ability to add new indices if it is the right solution to a problem.
- sister-wikis should be entirely within a single cluster
- commonswiki search will need some special considerations
- OtherIndex has to write to a different cluster at times
- Configuration to assign small wikis and sister wikis to appropriate places without spelling out each and every wiki. Or maybe we do spell it out with a dblist?
- This certainly adds operational complexity
- Probably more