On the balance of probabilities, running out of heap space appears to be the limiting factor in scaling up the number of wikis managed by our Elasticsearch cluster.
If we know how much heap we are using, we can add more (by raising the pod's memory limit and the JVM heap allocation) before we exhaust our resources entirely.
During task breakdown we suggested the following implementation:
- Create a CronJob that logs the heap usage percentage
- Create a log-based metric that parses these log lines (prototype it in the UI first, then store the definition in Terraform in git)
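A minimal sketch of the first step, assuming a Kubernetes CronJob can reach the cluster at a service named `elasticsearch:9200` (the name, namespace, and schedule here are placeholders, not our actual config). It uses Elasticsearch's `_cat/nodes` API, which can return per-node heap usage as a percentage:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: es-heap-logger  # hypothetical name
spec:
  schedule: "*/5 * * * *"  # every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: log-heap
              image: curlimages/curl:latest
              # Logs one line per node: node name and heap.percent.
              # The log-based metric would then parse these lines.
              args:
                - sh
                - -c
                - curl -s 'http://elasticsearch:9200/_cat/nodes?h=name,heap.percent'
```

The container's stdout lands in the pod logs, which is what the log-based metric in the second step would parse.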