Recompile/repackage elasticsearch-madvise for Opensearch
Closed, Resolved (Public)

Description

In T264053, @EBernhardson created a small C program, elasticsearch-madvise, to disable kernel readahead on the files Elasticsearch has open. This greatly reduced I/O on our Elasticsearch hosts. We'll need to adapt it for OpenSearch.

Creating this ticket to:

  • Create a new repo for opensearch-madvise
  • Package/distribute opensearch-madvise
  • Confirm operation of opensearch-madvise

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Add license and update changelog | repos/search-platform/opensearch-madvise!2 | bking | update-changelog | main
opensearch-madvise: rename from elasticsearch-madvise | repos/search-platform/opensearch-madvise!1 | bking | rename-to-opensearch | main

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2025-02-21T18:11:03Z] <inflatador> bking@apt1002:~$ sudo -E reprepro --ignore=wrongdistribution -C component/opensearch13 include bullseye-wikimedia $HOME/madvise-pkg/opensearch-madvise_0.1_amd64.changes T387030

Change #1121671 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] relforge: re-enable opensearch-madvise

https://gerrit.wikimedia.org/r/1121671

Change #1121671 merged by Bking:

[operations/puppet@production] relforge: re-enable opensearch-madvise

https://gerrit.wikimedia.org/r/1121671

Mentioned in SAL (#wikimedia-operations) [2025-02-21T18:51:37Z] <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030

Mentioned in SAL (#wikimedia-operations) [2025-02-21T18:51:40Z] <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030

Change #1121693 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] cirrus: point bash script to the correct executable

https://gerrit.wikimedia.org/r/1121693

Change #1121693 merged by Bking:

[operations/puppet@production] cirrus: point bash script to the correct executable

https://gerrit.wikimedia.org/r/1121693

The new opensearch-madvise package is deployed on relforge1004 and I confirmed that it works:

journalctl -u opensearch-disable-readahead.service | tac 

Feb 21 22:02:00 relforge1004 systemd[1]: Finished Disables readahead on all open files every 30 minutes to alleviate Cirrussearch / opensearch IO load spikes.
Feb 21 22:02:00 relforge1004 systemd[1]: opensearch-disable-readahead.service: Succeeded.
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32748]: Done
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32748]: + echo Done
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32752]: breaking pid 1131
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32748]: + /usr/bin/opensearch-madvise 1131
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32751]: ++ cat /run/opensearch-relforge-eqiad-small-alpha/relforge-eqiad-small-alpha.pid
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32748]: + for f in /run/opensearch*/*.pid
Feb 21 22:02:00 relforge1004 opensearch-disable-readahead.sh[32750]: success : /srv/opensearch/relforge-eqiad/nodes/0/indices/dzGl-WIzSrmFCRO2sBHXHA/3/index/_0.cfs
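
For readers unfamiliar with the wrapper script: because the output is piped through tac, the `+` trace lines read bottom-up. They suggest a loop roughly like the sketch below. This is a reconstruction from the trace, not the actual script in operations/puppet. The function name, the two override variables, and the skip-if-missing check are assumptions added for illustration; only the glob, the executable path, and the final `echo Done` come from the log.

```shell
#!/bin/sh
# Sketch reconstructed from the `+` trace lines in the journal output.
# Hypothetical additions (not from the source): the function name
# disable_readahead, the PID_GLOB / MADVISE_BIN overrides, and the
# existence check inside the loop.

# Defaults match the glob and executable path visible in the trace.
PID_GLOB="${PID_GLOB:-/run/opensearch*/*.pid}"
MADVISE_BIN="${MADVISE_BIN:-/usr/bin/opensearch-madvise}"

disable_readahead() {
    # PID_GLOB is left unquoted on purpose so the shell expands the pattern.
    for f in $PID_GLOB; do
        # When nothing matches, the pattern is passed through literally; skip it.
        [ -e "$f" ] || continue
        # One invocation per OpenSearch instance, passing the PID from its pid file.
        "$MADVISE_BIN" "$(cat "$f")"
    done
    echo Done
}
```

Calling `disable_readahead` with the defaults would run opensearch-madvise once per pid file under /run/opensearch*/, which matches the single `/usr/bin/opensearch-madvise 1131` invocation seen on relforge1004.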

As such, I'm closing out this ticket.

Mentioned in SAL (#wikimedia-operations) [2025-02-24T10:44:56Z] <brouberol@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1005* for test ability to ban opensearch node - brouberol@cumin2002 - T387030

Mentioned in SAL (#wikimedia-operations) [2025-02-24T10:45:00Z] <brouberol@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1005* for test ability to ban opensearch node - brouberol@cumin2002 - T387030