
Build updated opensearch-madvise .deb and update puppet with new cli argument
Closed, Resolved (Public)

Description

opensearch-madvise has been updated to take the path to operate on as a command-line argument. This slightly changes its CLI arguments and requires a matching change in Puppet.

Before: opensearch-madvise <pid>
After: opensearch-madvise <pid> <data path>
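
To illustrate why existing callers break, here is a minimal sketch of a two-positional-argument CLI in Python. It is not the actual opensearch-madvise implementation (this task does not say how the tool parses its arguments); the function names and the example path /srv/opensearch are hypothetical.

```python
import argparse
from pathlib import Path


def parse_args(argv):
    """Parse `<pid> <data path>`; both positional arguments are required."""
    parser = argparse.ArgumentParser(prog="opensearch-madvise")
    parser.add_argument("pid", type=int, help="PID of the OpenSearch process")
    parser.add_argument("data_path", type=Path, help="data directory to operate on")
    return parser.parse_args(argv)


def call_is_rejected(argv):
    """Return True if argparse rejects argv (it exits via SystemExit on error)."""
    try:
        parse_args(argv)
    except SystemExit:
        return True
    return False


if __name__ == "__main__":
    # Hypothetical values: the old single-argument form is rejected,
    # the new two-argument form is accepted.
    print(call_is_rejected(["1234"]))
    print(call_is_rejected(["1234", "/srv/opensearch"]))
```

Under this assumption, any caller that still passes only the PID fails until it is updated to pass the data path as well, which is why the Puppet change has to ship alongside the new package.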

Event Timeline

Change #1132692 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] Update opensearch-madvise call for version 0.2

https://gerrit.wikimedia.org/r/1132692

Change #1132692 merged by Bking:

[operations/puppet@production] Update opensearch-madvise call for version 0.2

https://gerrit.wikimedia.org/r/1132692

I built the package and deployed Puppet. The binary works, but the Puppet code is still complaining:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Systemd::Timer::Job[opensearch-disable-readahead]:
  expects a value for parameter 'interval'
  expects a value for parameter 'description'
  expects a value for parameter 'command'
  expects a value for parameter 'user' (file: /srv/puppet_code/environments/production/modules/profile/manifests/opensearch/cirrus/server.pp, line: 109) on node relforge1003.eqiad.wmnet
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

I'm temporarily reverting this change and we can work on it more tomorrow.
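
For reference, the parameters the catalog error reports as missing can be pulled out of the message mechanically. This is only a sketch for reading the log above; the helper name `missing_parameters` is made up, and the regular expression assumes the "expects a value for parameter '...'" wording shown in that error.

```python
import re


def missing_parameters(error_text):
    """Return the parameter names Puppet reports as lacking a value, in order."""
    return re.findall(r"expects a value for parameter '([^']+)'", error_text)


if __name__ == "__main__":
    # Excerpt of the error pasted in the comment above.
    sample = """\
Evaluation Error: Error while evaluating a Resource Statement, Systemd::Timer::Job[opensearch-disable-readahead]:
  expects a value for parameter 'interval'
  expects a value for parameter 'description'
  expects a value for parameter 'command'
  expects a value for parameter 'user'
"""
    print(missing_parameters(sample))
```

All four names are parameters the error says Systemd::Timer::Job[opensearch-disable-readahead] did not receive.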

Change #1139888 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] Revert "Revert "Update opensearch-madvise call for version 0.2""

https://gerrit.wikimedia.org/r/1139888

Change #1139888 merged by Bking:

[operations/puppet@production] Revert^2 "Update opensearch-madvise call for version 0.2"

https://gerrit.wikimedia.org/r/1139888

Change #1140199 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] cirrussearch: fix typo in systemd timer resource

https://gerrit.wikimedia.org/r/1140199

Change #1140199 merged by Bking:

[operations/puppet@production] cirrussearch: fix typo in systemd timer resource

https://gerrit.wikimedia.org/r/1140199

We fixed the Puppet code, but opensearch-disable-readahead-relforge-eqiad.service can't start.
The Puppet code that creates the unit looks correct, but the ExecStart is missing the second argument (provided by the {base_data_dir} variable):

root@relforge1008:~# systemctl cat opensearch-disable-readahead-relforge-eqiad.service
ExecStart=/usr/local/bin/opensearch-disable-readahead.sh relforge-eqiad

Theoretically, it should be possible to look up this value via PCC or a lookup on the Puppet server. I'm out of time for today, but will revisit tomorrow.
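
A quick way to spot this kind of problem across hosts is to count the arguments on the ExecStart line. The sketch below is an assumption-laden helper, not part of the Puppet module: the name `exec_start_args` is hypothetical, it uses shlex as an approximation of systemd's own quoting rules, and it ignores ExecStart prefix characters such as "-" or "@".

```python
import shlex


def exec_start_args(unit_text):
    """Return the arguments (without the executable) of the first ExecStart= line."""
    prefix = "ExecStart="
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # shlex.split approximates systemd's word splitting for simple commands.
            return shlex.split(line[len(prefix):])[1:]
    return []


if __name__ == "__main__":
    # The ExecStart line reported above on relforge1008.
    unit = "ExecStart=/usr/local/bin/opensearch-disable-readahead.sh relforge-eqiad"
    args = exec_start_args(unit)
    # Two arguments are expected; the base data dir is the missing second one.
    print(args, "OK" if len(args) >= 2 else "missing data path argument")
```

Feeding it the output of `systemctl cat` for the unit shows a single argument where two are expected.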

bking changed the task status from Open to In Progress. Apr 30 2025, 9:48 PM
bking claimed this task.
bking updated Other Assignee, removed: bking.

Change #1140741 had a related patch set uploaded (by Bking; author: Ebernhardson):

[operations/puppet@production] opensearch: Provide expected base_data_dir to readahead disable

https://gerrit.wikimedia.org/r/1140741

Change #1140741 merged by Bking:

[operations/puppet@production] opensearch: Provide expected base_data_dir to readahead disable

https://gerrit.wikimedia.org/r/1140741

I merged @EBernhardson's patch above and ran Puppet on cirrussearch2100. After starting opensearch-disable-readahead-production-search-omega-codfw.service and opensearch-disable-readahead-production-search-codfw.service, I can confirm that both services are working as expected. Closing...