
Deploy multi-tenant OpenSearch cluster as replacement for Elasticsearch
Open, Medium, Public, Feature

Description

Replace the Toolforge Elasticsearch cluster with an OpenSearch deployment with multi-tenancy enabled. This will improve the isolation of tenants within the service.


Original feature request:

Currently, it looks like anyone with Elasticsearch credentials can make write requests on any index. And unlike Toolforge Redis, it is possible to list all indices, so using secret index names doesn't work either.

Possible solutions:

  • Put Elasticsearch behind an HTTP proxy
  • Allow write requests only if index name begins with tool name (or if the index name is legacy - to allow for migration)
  • OR disallow listing of index names (GET /_cat/ calls) so that secret index name prefixes could be used for access control
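The prefix rule from the second option could be sketched roughly as follows. This is only an illustration of the check a proxy might apply, not an existing Toolforge component: the function names, the `<tool>-` prefix convention, and the `legacy_indices` allowlist are all assumptions made for the example.

```python
# Hypothetical sketch of a per-request check an HTTP proxy could apply.
# Assumes index names follow a "<tool>-..." convention (not stated in the task).

# Simplification: every PUT/POST/DELETE is treated as a write, even though
# Elasticsearch also accepts POST for searches.
WRITE_METHODS = {"PUT", "POST", "DELETE"}


def index_from_path(path: str) -> str:
    """Return the first path segment, which is where the index name sits."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    return first_segment.split("?", 1)[0]


def is_request_allowed(tool, method, path, legacy_indices=frozenset()):
    """Allow all reads; allow writes only to the tool's own or legacy indices."""
    if method.upper() not in WRITE_METHODS:
        return True
    index = index_from_path(path)
    # Paths starting with "_" are cluster-level APIs (for example _bulk),
    # which could touch any index, so writes to them are refused here.
    if index.startswith("_"):
        return False
    return index.startswith(f"{tool}-") or index in legacy_indices
```

With this sketch, `is_request_allowed("mytool", "PUT", "/mytool-pages/_doc/1")` passes, while a `DELETE` on `/othertool-pages` by the same tool is refused. Reads remain open to everyone, so the third option (blocking `GET /_cat/` calls) would need a separate rule.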

Event Timeline

Some time ago, when I first started looking at a project which would use ES, I decided I was unable to use the Toolforge ES for exactly the reason @SD0001 describes. I didn't want to ingest my many GB of data only to have somebody delete it all with a single command. Plus I may not want to expose all my data to the world, or even WMF Cloud users.

When I enquired, I was told that due to licensing issues, Toolforge was actually running an Elasticsearch fork called OpenSearch (see http://opensearch.org), but that OpenSearch didn't support the multi-tenancy features needed to provide data isolation.

Roll forward a year or so. I've installed OpenSearch 2.11, which does have multi-tenancy support, on my VPS instance. I'd actually love to not be running my own OpenSearch. I suck at Linux sysadmin, which the WMF SRE folks are expert at. Actually, I used to be very good at it, but it's been a while since I did it for real, so most of my skills in that department are stale. I'd rather be spending my time doing data analysis. It would be wonderful if a modern version of OpenSearch could be stood up in Toolforge, where I'm sure the WMF staff would do a far better job of administering it than what I've duct-taped together.

@bd808 following up on our conversation earlier today, it turns out this ticket already exists (and I had forgotten about it). I'll just add, per our conversation, that I'd be happy to work with a WMF team to get this up and running, sharing what I learned from my own experiences, and then hand over long-term ownership to the WMF team. If you could socialize this in the right places, I'd appreciate it.

I'm going to be bold and suggest that we rewrite the root task here to be about deploying OpenSearch with multi-tenancy enabled. The reason our current cluster does not have true multi-tenancy is that the elastic.co offerings in that space were non-free (both libre and gratis). The OpenSearch fork has reimplemented these features under a FOSS license, which presents us with an opportunity to solve this community feature request without needing to invent things from scratch.
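As a rough illustration of what per-tool isolation might look like with the OpenSearch Security plugin, the sketch below builds a role body that limits a tool to indices under its own prefix. The `tool_role` helper and the `<tool>-*` naming convention are assumptions for the example; the field names (`index_permissions`, `index_patterns`, `allowed_actions`) and the `crud` and `create_index` action groups follow the plugin's role format as I understand it, and a real deployment would probably also need cluster-level permissions.

```python
import json


def tool_role(tool: str) -> dict:
    """Build a Security plugin role body scoped to one tool's indices.

    Hypothetical helper: assumes each tool's indices are named "<tool>-*".
    """
    return {
        "index_permissions": [
            {
                # Only indices under the tool's own prefix are matched.
                "index_patterns": [f"{tool}-*"],
                # Built-in action groups: document CRUD plus index creation.
                "allowed_actions": ["crud", "create_index"],
            }
        ]
    }


if __name__ == "__main__":
    # Prints the JSON body that would be sent when creating the role.
    print(json.dumps(tool_role("mytool"), indent=2))
```

A body like this would be what gets submitted to the plugin's roles API, with a role mapping then tying each tool's credentials to its role.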

bd808 renamed this task from Add access control for Toolforge Elasticsearch to Deploy multi-tentant OpenSearch cluster as replacement for Elasticsearch.Oct 7 2024, 1:23 PM
bd808 updated the task description. (Show Details)
bd808 changed the subtype of this task from "Task" to "Feature Request".
fnegri renamed this task from Deploy multi-tentant OpenSearch cluster as replacement for Elasticsearch to Deploy multi-tenant OpenSearch cluster as replacement for Elasticsearch.Oct 31 2024, 3:24 PM

Linking T379288 since we might explore security features too for the (upcoming) WMF internal OpenSearch cluster used for search.