User Details
- User Since: Dec 15 2021, 9:19 PM (217 w, 3 d)
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: BKing (WMF)
Fri, Feb 13
I've updated the opensearch-semantic-search and opensearch-semantic-search-test pods to 8 GB / 4 vCPUs (the opensearch-medium flavor) and redeployed in both DCs. Closing...
Thu, Feb 12
I think we are waiting on DC Ops for this task, so I'm moving it to "Tracking" status on our board. @Jclark-ctr or any other DC Ops members, feel free to hit us back here if I'm incorrect about this.
Fixed by gerrit commit I6a6d451369100e9d7cab1a9b239f962b170a3b8f
We should check out the options for user/role management in the newest version of the chart before we do any work here. I've created T417328 for this purpose.
Wed, Feb 11
Re-opening, as using the 2.8 chart will be a requirement for the OpenSearch operator 3 project.
Re: emptyDir (local storage)
Tue, Feb 10
Update: One of the OpenSearch pods has significantly higher search latency than the other two. The slower pod is on a host with a 1 Gbps network link.
The new OpenSearch backend was rolled out today from ~11:45 to ~12:45 UTC. It had to be reverted due to timeouts.
Mon, Feb 9
For reference:
- OpenSearch docs for configuring mTLS (aka client certificate authentication)
- Previous change that created a separate OpenSearch user for index operations.
@Jclark-ctr sorry for the confusion, we recycled some old Elastic hosts into WDQS hosts in T409769 and T409769. We did this after asking for an expansion in T405276, which threw off the numbering scheme in that request. I can fix that if you like.
Balthazar already mentioned our (as in Data-Platform-SRE) interest, but it would be just as handy for OpenSearch on K8s as it would be for Airflow, for exactly the same reasons. We are happy to help experiment when the time is right.
There are a couple of unfinished operations on this board:
- Change number of replicas: The current design provisions exactly 3 pods, so we don't really need to test this.
- Drain a Kubernetes node and check that the OS cluster tolerates it nicely: We have existing workarounds for this if need be (delete a pod; storage is not deleted, so downtime is minimal).
Contrary to my last statement, we no longer need this dashboard for Mutualized OpenSearch (now called OpenSearch on Kubernetes) as it already has its own dashboard.
Fixed by gerrit commit I6a6d451369100e9d7cab1a9b239f962b170a3b8f
Thanks @hashar, I must have had my brain dripping out of my ears when I wrote that last update. As you responded, docker-pkg is actively maintained, so there is no reason to pull that thread. I would ask to add a link back to the docker-pkg repo in that same infobox; happy to make a PR for that.
Fri, Feb 6
I've created another dashboard, DSE K8s Blackbox probes, to aid the investigation.
I've deployed opensearch-semantic-search as promised. All test/non-production clusters are on the latest trixie-based OpenSearch 3 image. Closing...
I agree with @MLechvien-WMF that this is a nice-to-have only, but I would also like to add that I lost a lot of time on Monday by duplicating a docker image name (see this ugly series of commits). Although my changes were reviewed by several people, it took @RLazarus's sharp eyes to finally spot the problem.
Thanks Effie! I'll spin out a subtask for DPE-owned charts.
@brouberol Per your request, the following clusters are now on the docker-registry.wikimedia.org/repos/data-engineering/opensearch:2026-02-03-212519-70e19086545c5f33b67dde8506c9ffea81761132-production3@sha256:6e746ee4dad5f3fbd5f6c671fa44b7964f5b516c994d4990842bc5eed7839579 image:
- opensearch-test (eqiad, codfw)
- opensearch-semantic-search-test (eqiad, codfw)
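For anyone checking which image a cluster is actually running, the reference above can be split into its registry, repository, tag, and digest. This is only a minimal sketch: `parse_image_ref` is a hypothetical helper (not part of any existing tooling), and it assumes the first path component is always the registry host, which holds for this reference but not for every image name.

```python
def parse_image_ref(ref: str) -> dict:
    """Split 'registry/repo:tag@algo:hex' into its parts (hypothetical helper)."""
    # Everything after '@' is the content digest, if present.
    name_and_tag, _, digest = ref.partition("@")
    # The tag separator is a ':' that appears after the last '/',
    # so a registry port such as 'host:5000/...' is not mistaken for a tag.
    last_slash = name_and_tag.rfind("/")
    colon = name_and_tag.rfind(":")
    if colon > last_slash:
        name, tag = name_and_tag[:colon], name_and_tag[colon + 1:]
    else:
        name, tag = name_and_tag, ""
    # Assumption: the first path component is the registry host.
    registry, _, repository = name.partition("/")
    return {
        "registry": registry,
        "repository": repository,
        "tag": tag,
        "digest": digest,
    }


IMAGE = (
    "docker-registry.wikimedia.org/repos/data-engineering/opensearch:"
    "2026-02-03-212519-70e19086545c5f33b67dde8506c9ffea81761132-production3"
    "@sha256:6e746ee4dad5f3fbd5f6c671fa44b7964f5b516c994d4990842bc5eed7839579"
)
parts = parse_image_ref(IMAGE)
```

Comparing `parts["digest"]` rather than the tag is the safer check, since the digest identifies the exact image content.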