
Deploy a test opensearch cluster on DSE k8s, using opensearch operator
Closed, Resolved, Public

Description

AC

Event Timeline

Gehel updated the task description.
Gehel updated the task description.
bking changed the task status from Open to In Progress. Oct 7 2025, 1:41 PM
bking claimed this task.
bking triaged this task as Medium priority.
bking updated the task description.
bking updated the task description.
bking updated the task description.

We have a working OpenSearch cluster in dse-k8s-eqiad:

kubectl get po
NAME                                                              READY   STATUS      RESTARTS   AGE
cluster-masters-0                                                 1/1     Running     0          157m
cluster-masters-1                                                 1/1     Running     0          152m
cluster-masters-2                                                 1/1     Running     0          151m
cluster-securityconfig-update-w64t5                               0/1     Completed   0          157m
operator-opensearch-operator-controller-manager-54cc5d7965flgrc   1/1     Running     0          2d2h

The next steps will be to work on TLS and ingress.

Change #1196412 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/deployment-charts@master] Deploy the opensearch-operator to the opensearch-test namespace

https://gerrit.wikimedia.org/r/1196412

Change #1196412 merged by jenkins-bot:

[operations/deployment-charts@master] Deploy the opensearch-operator to the opensearch-test namespace

https://gerrit.wikimedia.org/r/1196412

Change #1196433 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] Move operator related common values away from services values and into admin_ng

https://gerrit.wikimedia.org/r/1196433

Change #1196433 merged by Brouberol:

[operations/deployment-charts@master] Move operator related common values away from services values and into admin_ng

https://gerrit.wikimedia.org/r/1196433

Change #1196497 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/deployment-charts@master] Change the name of the cluster deployed in the opensearch-test namespace

https://gerrit.wikimedia.org/r/1196497

Change #1196497 merged by jenkins-bot:

[operations/deployment-charts@master] Change the name of the cluster deployed in the opensearch-test namespace

https://gerrit.wikimedia.org/r/1196497

I've loaded some test data into the cluster using a modified version of this blog post.

Specifically, I:

  • Grabbed a production mapping via: bking@cumin2002:~$ curl -s https://search.discovery.wmnet:9243/enwikibooks/_mapping?pretty
  • Removed all the custom analyzers from the JSON blob via ChatGPT. Note that I only operated on the content mapping (there are also general and archive mappings in the JSON).
  • Uploaded the mapping to the test instance via: curl -u ${PW} -H 'Content-Type: application/json' -XPUT $es/$index -d @mapping.json
  • Followed steps 3 and 4 from the blog post (split the dump, then PUT the data via the OpenSearch bulk API).
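For anyone repeating the last step, here is a rough sketch of the split-and-upload loop. It is not the exact script I ran. It assumes the dump is already decompressed and in bulk format (one action line followed by one document line), and the function names, the chunk prefix and the default chunk size are made up for illustration. $es, $index and ${PW} are the same variables as in the commands above.

```shell
# Split a bulk-format dump into chunks with an even number of lines,
# so that every action line stays in the same chunk as its document line.
# Arguments: dump file, output prefix, lines per chunk (default 500; assumed value).
split_dump() {
  dump_file=$1
  out_prefix=$2
  lines_per_chunk=${3:-500}
  if [ $((lines_per_chunk % 2)) -ne 0 ]; then
    echo "lines_per_chunk must be even" >&2
    return 1
  fi
  split -l "$lines_per_chunk" "$dump_file" "$out_prefix"
}

# Send every chunk matching the prefix to the bulk endpoint of $index.
# --data-binary keeps the newlines that the bulk format depends on.
bulk_upload() {
  chunk_prefix=$1
  for chunk in "$chunk_prefix"*; do
    echo "uploading $chunk" >&2
    curl -s -u "${PW}" -H 'Content-Type: application/x-ndjson' \
      -XPOST "$es/$index/_bulk" --data-binary "@$chunk" > /dev/null
  done
}
```

Usage would be something like split_dump dump.json chunk_ 500 followed by bulk_upload chunk_. The output of each bulk call is discarded here to keep the terminal readable; drop the redirect if you want to inspect per-item errors.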

Verified that the data is now searchable:

Request:

curl -u ${PW}   -X GET "$es/enwikibooks/_search?pretty"   -H 'Content-Type: application/json'   -d '{
    "query": {
      "match": {
        "all": "mint leaves"
      }
    },
    "_source": ["title", "opening_text", "auxiliary_text"]
  }'

Response:

  "took" : 515,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4624,
      "relation" : "eq"
    },
    "max_score" : 18.301796,
    "hits" : [
      {
        "_index" : "enwikibooks",
        "_id" : "424726",
        "_score" : 18.301796,
        "_source" : {
          "opening_text" : "Cookbook | Recipes | Ingredients | Equipment | Techniques | Cookbook Disambiguation Pages | Recipes
There are two ways to make this mint infusion. You can either use dried up mint leaves or fresh mint leaves.",
          "auxiliary_text" : [
            "Mint Tea Category Beverage recipes Time 3–10 minutes Difficulty"
          ],
...

One thing I also noticed is that the audit index is already very large:

curl -u ${PW} -s ${es}/_cat/indices
green open security-auditlog-2025.10.21 o7CJelhPTyGRRHLC3RF0Sw 1 1    545   0   1.5gb 779.5mb

It also seemed to grow noticeably with every API call (I was running a for loop that uploaded 500 chunks of index data). We don't want to run out of disk space because of audit logging, so we need to figure out a plan.
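To keep an eye on this while we decide what to do, a small filter over the _cat/indices output may help. This is only a sketch: the function name is made up, and it assumes the default column order shown above (index name in column 3, document count in column 7, total store size in column 9), which would change if the request used a custom h= column list.

```shell
# Read `_cat/indices` output on stdin and print the name, document count
# and total store size of each audit log index.
# Assumes the default column order of `_cat/indices`.
audit_index_sizes() {
  awk '$3 ~ /^security-auditlog-/ { print $3, $7, $9 }'
}
```

It can be fed directly from the command above: curl -u ${PW} -s ${es}/_cat/indices | audit_index_sizes. Running it before and after a batch of bulk requests gives a rough idea of how fast the audit index grows.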

bking updated the task description.