
Fix apifeatureusage index cleanup (curator)
Closed, Resolved (Public)

Description

We are getting alerts for curator_actions_apifeatureusage_codfw.service on apifeatureusage1001:9100. That service is failing because its version of Curator does not work with OpenSearch:

May 18 00:42:03 apifeatureusage1001 curator[1911181]: 2025-05-18 00:42:03,203 ERROR Elasticsearch version 1.3.20 incompatible with this version of Curator (5.8.1)

We can probably do a quick fix by using the Observability-maintained Curator fork that's in our apt repos. The long-term fix (using OpenSearch's built-in lifecycle management) is being discussed in T386525.

Creating this ticket to:

  • Write Puppet code to use an OpenSearch-compatible version of Curator and apply to apifeatureusage hosts.
  • Confirm operation

Event Timeline

Gehel triaged this task as High priority. May 20 2025, 2:07 PM
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.

Note to self: T301017 has a lot of good info about how to use the Observability fork.

Change #1151687 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator

https://gerrit.wikimedia.org/r/1151687

Change #1151687 merged by Bking:

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator

https://gerrit.wikimedia.org/r/1151687

Change #1151754 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator, part 2

https://gerrit.wikimedia.org/r/1151754

Change #1151754 merged by Bking:

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator, part 2

https://gerrit.wikimedia.org/r/1151754

Change #1151775 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator, part 3

https://gerrit.wikimedia.org/r/1151775

Change #1151775 merged by Bking:

[operations/puppet@production] apifeatureusage: switch to Observability-maintained curator, part 3

https://gerrit.wikimedia.org/r/1151775

bking changed the task status from Open to In Progress. May 28 2025, 8:35 PM
bking claimed this task.
bking closed this task as Resolved. May 29 2025, 4:10 PM

I checked the curator_actions_apifeatureusage systemd units after the patches above were merged, and I can confirm they are no longer in a failed state.

I also verified that curator is actually applying its settings to the relevant indices:

curl -s http://search.svc.codfw.wmnet:9200/apifeatureusage-2025.02.28?pretty
{
  "apifeatureusage-2025.02.28" : {
    "aliases" : { },
    "mappings" : {
      "dynamic_templates" : [
        {
          "string_fields" : {
            "match" : "*",
            "match_mapping_type" : "string",
            "mapping" : {
              "index" : false
            }
          }
        }
      ],
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "keyword"
        },
        "agent" : {
          "type" : "keyword"
        },
        "feature" : {
          "type" : "keyword"
        },
        "type" : {
          "type" : "text",
          "index" : false
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "5s",
        "number_of_shards" : "1",
        "provided_name" : "apifeatureusage-2025.02.28",
        "creation_date" : "1740700800255",
        "analysis" : {
          "analyzer" : {
            "default" : {
              "type" : "standard",
              "stopwords" : "_none_"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "2PGhFkNLQEKCZO1OoC-eaQ",
        "version" : {
          "created" : "7100299"
        }
      }
    }
  }
}
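
For anyone repeating this check across several indices, the response above can be reduced to the few fields worth eyeballing. This is a minimal sketch, not part of the Puppet change: `summarize_index` is a hypothetical helper, and the choice of fields (replicas, shards, creation date) is an assumption, since this ticket does not state which settings Curator manages. The sample data is a trimmed copy of the response above.

```python
from datetime import datetime, timezone


def summarize_index(response: dict) -> dict:
    """Reduce a GET /<index> response to a few settings per index.

    Hypothetical helper: the selected fields are an assumption about
    what is worth checking, not a list of what Curator changes.
    """
    summary = {}
    for name, body in response.items():
        index = body["settings"]["index"]
        # creation_date is epoch milliseconds, returned as a string
        created_s = int(index["creation_date"]) // 1000
        created = datetime.fromtimestamp(created_s, tz=timezone.utc)
        summary[name] = {
            "replicas": int(index["number_of_replicas"]),
            "shards": int(index["number_of_shards"]),
            "created": created.date().isoformat(),
        }
    return summary


# Trimmed copy of the response shown above
sample = {
    "apifeatureusage-2025.02.28": {
        "settings": {
            "index": {
                "number_of_shards": "1",
                "number_of_replicas": "1",
                "creation_date": "1740700800255",
            }
        }
    }
}

print(summarize_index(sample))
```

In practice the `sample` dict would be replaced by the parsed JSON from the curl call above, for example by piping its output to a script that reads it with `json.load(sys.stdin)`.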

As such, I'm closing out this ticket.