Set up OpenSearch instance supporting vector search
Open, Needs Triage, Public

Description

Research would like to start working with OpenSearch to experiment with vector embeddings. This would be a good opportunity for Search and Research to get hands-on experience with this kind of search.

I would suggest the following subtasks:

  • create an instance/a cluster
  • set up an index that resembles one of our content indices
  • vector creation:
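
To make the index subtask a bit more concrete, here is a rough sketch of a create-index body with one k-NN vector field, plus a matching query body. Everything specific in it is an assumption for illustration, not something decided in this task: the field names (`title`, `text`, `embedding`), the dimension of 384, and the choice of the `lucene` engine with HNSW and cosine similarity. A real content index has far more fields and analyzers than this.

```python
import json


def build_knn_index_body(dimension, vector_field="embedding"):
    """Return a create-index request body with one k-NN vector field.

    The text fields and the vector field name are hypothetical stand-ins
    for whatever the real content index mapping contains.
    """
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    return {
        # The k-NN plugin requires this index-level setting.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "text": {"type": "text"},
                vector_field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    # Assumed method settings; other engines/spaces exist.
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                    },
                },
            }
        },
    }


def build_knn_query(vector, k=10, vector_field="embedding"):
    """Return a search body asking for the k nearest neighbours of vector."""
    return {
        "size": k,
        "query": {"knn": {vector_field: {"vector": list(vector), "k": k}}},
    }


if __name__ == "__main__":
    # Print the body so it can be inspected or pasted into a PUT request.
    print(json.dumps(build_knn_index_body(384), indent=2))
```

The functions only build request bodies, so they can be reviewed without touching a cluster. The index body would go in a `PUT /<index>` request and the query body in a `POST /<index>/_search` request.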

Event Timeline

Per today's meeting, this is blocked by T407123 and T409941.

Do we need any specific plugins on this instance? At the moment, we're working on a minimal OpenSearch deployment, with no additional plugins, meant for the non-Search use cases.

I can see how it would make sense to have a deployment that matches the Search deployment. That is possible, but it would require more work from DPE SRE, and thus more lead time.

> Do we need any specific plugins on this instance? At the moment, we're working on a minimal OpenSearch deployment, with no additional plugins, meant for the non-Search use cases.

We'd need a fairly recent OpenSearch version (ideally 3.3) with the following plugins:

  • opensearch-ml
  • opensearch-knn
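
One way to confirm that a given cluster has both plugins on every node is to read `GET /_cat/plugins?format=json` and compare it against the list above. The sketch below assumes the plugin component names reported by the cluster match the names listed here; the helper names are made up for illustration.

```python
import json
import urllib.request

# Plugin names as listed above; assumed to match the "component" column.
REQUIRED_PLUGINS = frozenset({"opensearch-ml", "opensearch-knn"})


def missing_plugins(cat_rows, required=REQUIRED_PLUGINS):
    """Map node name -> sorted list of required plugins it lacks.

    cat_rows is the parsed JSON from /_cat/plugins?format=json, i.e. a list
    of objects with "name" (node) and "component" (plugin) keys. Nodes with
    no plugins at all produce no rows, so they cannot be detected here.
    """
    installed_by_node = {}
    for row in cat_rows:
        installed_by_node.setdefault(row["name"], set()).add(row["component"])
    result = {}
    for node, installed in installed_by_node.items():
        lacking = sorted(set(required) - installed)
        if lacking:
            result[node] = lacking
    return result


def fetch_cat_plugins(base_url):
    """Fetch the plugin listing from a cluster (needs network access)."""
    url = f"{base_url}/_cat/plugins?format=json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

An empty dict from `missing_plugins(fetch_cat_plugins(base_url))` means every node that reported plugins has both of them.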

We want to complete the current deployment of OpenSearch on k8s before changing too much (T362105), and also complete our first use case (T357753).

DPE SRE is quite busy at the moment; I doubt that we can move this forward before January 2026.

A temporary 3-node cluster has been stood up in T410681. It is running OpenSearch 3.3.2 and is accessible from the analytics network (stat machines, Hadoop, etc.).

(ebernhardson@stat1008)-~$ curl http://relforge1009.eqiad.wmnet:9200/
{
  "name" : "relforge1009.eqiad.wmnet",
  "cluster_name" : "opensearch-relforge",
  "cluster_uuid" : "8uegbMTJQOS83recXfCijg",
  "version" : {
    "distribution" : "opensearch",
    "number" : "3.3.2",
    "build_type" : "tar",
    "build_hash" : "6564992150e26aaa62d4522a220dfff5188aeb88",
    "build_date" : "2025-10-29T22:24:07.450919802Z",
    "build_snapshot" : false,
    "lucene_version" : "10.3.1",
    "minimum_wire_compatibility_version" : "2.19.0",
    "minimum_index_compatibility_version" : "2.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
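
For scripts that want to make the same check programmatically, a small sketch that parses a root response like the one above and compares the version against the 3.3 minimum mentioned earlier. The function names are hypothetical, and the parsing assumes a dotted numeric version string, optionally followed by a `-` suffix.

```python
def version_tuple(root_response):
    """Extract (major, minor, patch) from a parsed root-endpoint response."""
    number = root_response["version"]["number"]
    # Drop any "-SNAPSHOT"-style suffix before splitting on dots.
    core = number.split("-")[0]
    return tuple(int(part) for part in core.split(".")[:3])


def meets_minimum(root_response, minimum=(3, 3)):
    """True if the reported version is at least `minimum`."""
    minimum = tuple(minimum)
    return version_tuple(root_response)[: len(minimum)] >= minimum
```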

@fkaelin, the OpenSearch instance is ready. Is that enough to get started?