
Add alerting on Elasticsearch Shards Reaching Node limit
Closed, Resolved (Public)

Description

Add an alert that fires when it looks likely that we will exhaust the shard capacity of our existing data nodes within the next 2 weeks. If that is hard to predict, then as a first approximation let's have the alert fire at 90% shard usage.
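A minimal sketch of the two firing conditions described above (90% usage, or projected exhaustion within 2 weeks). It assumes capacity is the number of data nodes multiplied by the per-node shard limit, and that growth is roughly linear between the oldest and newest sample. All function names and the sample format are hypothetical; where the numbers come from is left open.

```python
from typing import Optional, Sequence, Tuple

# (day, shard_count) pairs, oldest first. Hypothetical sample format.
Sample = Tuple[float, int]


def shard_capacity(data_nodes: int, max_shards_per_node: int) -> int:
    """Total shards the data nodes can hold (assumed: nodes * per-node limit)."""
    return data_nodes * max_shards_per_node


def days_until_exhaustion(samples: Sequence[Sample], capacity: int) -> Optional[float]:
    """Linear extrapolation from the first and last sample.

    Returns None when there is too little data or the shard count is not growing.
    """
    if len(samples) < 2:
        return None
    (d0, s0), (d1, s1) = samples[0], samples[-1]
    if d1 <= d0 or s1 <= s0:
        return None
    rate = (s1 - s0) / (d1 - d0)  # shards added per day
    return max(0.0, (capacity - s1) / rate)


def should_alert(
    samples: Sequence[Sample],
    capacity: int,
    usage_threshold: float = 0.9,   # the 90% fallback from the description
    horizon_days: float = 14.0,     # "the next 2 weeks"
) -> bool:
    """Fire on high usage, or on projected exhaustion within the horizon."""
    if not samples:
        return False
    current = samples[-1][1]
    if current / capacity >= usage_threshold:
        return True
    remaining = days_until_exhaustion(samples, capacity)
    return remaining is not None and remaining <= horizon_days
```

For example, with 3 data nodes at an assumed limit of 1000 shards each, a count growing from 2000 to 2350 over 7 days projects exhaustion in 13 days, so the alert would fire even though usage is still below 90%.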

We want this alert so that we don't fully exhaust the shard count and then are unable to make indices for new wikis.

Exceeding the shard limit is very inconvenient, since we then have to go back and recreate the affected indices.

When this alert triggers we need to provision a new Elasticsearch data node. We should also check at that time that the master nodes aren't using more than 75% of their heap. If they are, we need to add more memory to the master nodes.
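The heap check could be sketched as follows. This assumes the input is a dict shaped like the response of the Elasticsearch node stats API (`GET _nodes/stats/jvm`), where each node entry carries a `roles` list and `jvm.mem.heap_used_percent`. The function name and the sample data are hypothetical, and fetching the stats from the cluster is not shown.

```python
from typing import Any, Dict, List


def masters_over_heap(stats: Dict[str, Any], threshold_percent: int = 75) -> List[str]:
    """Return names of master-eligible nodes using more than the threshold of heap.

    `stats` is assumed to look like a node stats response:
    {"nodes": {"<id>": {"name": ..., "roles": [...],
                        "jvm": {"mem": {"heap_used_percent": ...}}}}}
    """
    over = []
    for node_id, node in stats.get("nodes", {}).items():
        if "master" not in node.get("roles", []):
            continue  # only master nodes matter for this check
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used > threshold_percent:  # "more than 75%", so strictly greater
            over.append(node.get("name", node_id))
    return sorted(over)


# Made-up sample data, for illustration only.
example = {
    "nodes": {
        "a": {"name": "master-0", "roles": ["master"],
              "jvm": {"mem": {"heap_used_percent": 81}}},
        "b": {"name": "master-1", "roles": ["master"],
              "jvm": {"mem": {"heap_used_percent": 75}}},
        "c": {"name": "data-0", "roles": ["data"],
              "jvm": {"mem": {"heap_used_percent": 90}}},
    }
}
print(masters_over_heap(example))
```

A non-empty result would indicate that the master nodes need more memory in addition to the new data node.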

To take either of these actions we may need to add more Kubernetes nodes to the cluster.

Event Timeline

Fring removed Fring as the assignee of this task. Feb 7 2024, 3:10 PM
Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q1 2024) board.
Fring subscribed.