
Build and deploy updated opensearch plugins deb
Closed, ResolvedPublic

Description

The opensearch plugins found in Gerrit at operations/software/opensearch/plugins have been updated to include the opensearch-knn plugin.

AC:

  • Updated package, version 1.3.20-3, built and shipped to apt.wikimedia.org
  • Package installed to relevant clusters (relforge, cloudelastic) via rolling restart cookbook

Details

Other Assignee
RKemper
Related Changes in Gerrit:

Event Timeline

Wasn't sure if it should be part of the AC, but once the package is built we can also merge https://gitlab.wikimedia.org/repos/search-platform/cirrussearch-opensearch-image/-/merge_requests/13 which updates the container used in testing/local dev to include the new package.

bking changed the task status from Open to In Progress.Mar 27 2025, 2:34 PM
bking claimed this task.
bking updated Other Assignee, added: RKemper.

Mentioned in SAL (#wikimedia-operations) [2025-03-28T15:29:29Z] <inflatador> bking@apt1002 publish wmf-opensearch-search-plugins-1.3.20-3 to component/opensearch13 bullseye-wikimedia T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-28T16:44:32Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-28T16:44:37Z] <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-28T19:31:41Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-28T19:31:45Z] <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Change #1132024 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] relforge: enable rack awareness

https://gerrit.wikimedia.org/r/1132024

Change #1132024 merged by Bking:

[operations/puppet@production] relforge: enable rack awareness

https://gerrit.wikimedia.org/r/1132024

Mentioned in SAL (#wikimedia-operations) [2025-03-31T13:19:15Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T13:19:20Z] <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply updated master config - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T13:22:30Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1003* for ban relforge1003 prior to service restart - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T13:22:35Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1003* for ban relforge1003 prior to service restart - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T14:01:56Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1004* for ban relforge1004 prior to service restart and decom T390565 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T14:02:02Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1004* for ban relforge1004 prior to service restart and decom T390565 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T14:08:34Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new plugins - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-03-31T14:45:43Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new plugins - bking@cumin2002 - T390100

The packages are deployed, but Relforge/Cloudelastic have not yet been restarted. Working on that now...

Mentioned in SAL (#wikimedia-operations) [2025-04-30T16:59:55Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new madvise and row/rack awareness T391392 T390100 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T17:27:51Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new madvise and row/rack awareness T391392 T390100 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T17:32:57Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new madvise pkg as I forgot last time T390100 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T18:00:39Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new madvise pkg as I forgot last time T390100 - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:11:10Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new knn plugin - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:11:15Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new knn plugin - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:12:20Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new wmf-opensearch-search-plugins version 1.3.20-4~bullseye - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:15:12Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply new wmf-opensearch-search-plugins version 1.3.20-4~bullseye - bking@cumin2002 - T390100

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:15:16Z] <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply new wmf-opensearch-search-plugins version 1.3.20-4~bullseye - bking@cumin2002 - T390100

The opensearch-knn plugin is deployed across relforge, cloudelastic, and prod CODFW:

$ curl -s https://relforge1008.eqiad.wmnet:9243/_cat/plugins | grep knn
relforge1004-relforge-eqiad opensearch-knn                      1.3.20.0
relforge1003-relforge-eqiad opensearch-knn                      1.3.20.0
relforge1008-relforge-eqiad opensearch-knn                      1.3.20.0
relforge1009-relforge-eqiad opensearch-knn                      1.3.20.0
$ for n in 9243 9443 9643; do curl -s -XGET https://cloudelastic.wikimedia.org:${n}/_cat/plugins | grep knn | wc -l; done
6
6
6
$ for n in 9243 9443 9643; do curl -s -XGET https://search.svc.codfw.wmnet:${n}/_cat/plugins | grep knn | wc -l; done
60
30
30
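
For anyone repeating this check, here is a minimal sketch of a stricter count. It assumes the `_cat/plugins` output keeps the three-column layout shown above (node name, plugin name, version). The helper name `count_plugin_nodes` is hypothetical and is not part of any existing tooling. Unlike `grep knn | wc -l`, it matches the plugin name exactly and counts each node only once.

```shell
#!/bin/sh
# count_plugin_nodes PLUGIN
# Hypothetical helper: reads `_cat/plugins` output on stdin and prints the
# number of distinct nodes (column 1) that report PLUGIN (column 2).
# Assumes whitespace-separated columns: node name, plugin name, version.
count_plugin_nodes() {
    awk -v plugin="$1" '
        # Count a node the first time it is seen with an exact plugin match.
        $2 == plugin && !seen[$1]++ { n++ }
        # Print 0 rather than an empty line when nothing matched.
        END { print n + 0 }
    '
}
```

It could be used in place of the `grep | wc -l` pipeline, for example `curl -s https://relforge1008.eqiad.wmnet:9243/_cat/plugins | count_plugin_nodes opensearch-knn`.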

As such, I'm closing this ticket. Please feel free to reopen if we missed something.

Mentioned in SAL (#wikimedia-operations) [2025-04-30T21:40:09Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new wmf-opensearch-search-plugins version 1.3.20-4~bullseye - bking@cumin2002 - T390100