Enable GC (garbage collection) logs on Elasticsearch JVM
Closed, Resolved (Public)

Assigned To

Authored By

	Gehel
	May 10 2016, 9:12 AM

Description

A recent issue on Elasticsearch points to GC overload. Collecting GC logs would help diagnose this kind of issue if it ever happens again. Some care is needed around log rotation: the JVM's GC logging is heavily optimized, which creates a few issues for log rotation.
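
For context, Java 8 era HotSpot JVMs expose GC logging and JVM-managed log rotation through command-line flags. The sketch below only assembles such a flag list; it is not taken from the Puppet patches on this task, and the helper name, log path, file count, and file size are hypothetical values chosen for illustration.

```python
def gc_logging_flags(log_path, num_files=10, file_size="20M"):
    """Build Java 8 HotSpot options for GC logging with JVM-managed rotation.

    The defaults are illustrative assumptions, not values from this task.
    """
    return [
        f"-Xloggc:{log_path}",                   # write GC events to this file
        "-XX:+PrintGCDetails",                   # per-collection detail
        "-XX:+PrintGCDateStamps",                # wall-clock timestamps
        "-XX:+UseGCLogFileRotation",             # let the JVM rotate the log itself
        f"-XX:NumberOfGCLogFiles={num_files}",   # how many rotated files to keep
        f"-XX:GCLogFileSize={file_size}",        # rotate once a file reaches this size
    ]


if __name__ == "__main__":
    # Hypothetical path; print the flags as a single JVM options string.
    print(" ".join(gc_logging_flags("/var/log/elasticsearch/gc.log")))
```

Letting the JVM rotate its own GC log avoids having an external tool move or truncate a file the JVM keeps open. JDK 9 and later replace these flags with unified logging (`-Xlog:gc*`), so the sketch applies only to older JVMs.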

Details

	Subject	Repo	Branch	Lines +/-
	elasticsearch - enable GC logs by default	operations/puppet	production	+2 -5
	elasticsearch - enable garbage collection logs on relforge servers	operations/puppet	production	+63 -29


Related Objects

		Status	Subtype	Assigned	Task
		Resolved		debt	T134829 Followup on elastic1026 blowing up May 9, 21:43-22:14 UTC
		Resolved		Gehel	T134853 Enable GC (garbage collection) logs on Elasticsearch JVM

Event Timeline

Gehel created this task. May 10 2016, 9:12 AM

debt renamed this task from Enable GC logs on Elasticsearch JVM to Enable GC (garbage collection) logs on Elasticsearch JVM. Jun 16 2016, 10:05 PM

Dzahn unsubscribed. Jun 21 2016, 3:42 AM

Moving to the backlog board until we have more time to look at this.

greg added a project: Wikimedia-Incident. Jul 27 2016, 10:53 PM

greg moved this task from Active investigation to Follow-up prevention on the Wikimedia-Incident board.

This follow-up task from an incident report has not been updated recently. If it is no longer valid, please add a comment explaining why. If it is still valid, please prioritize it appropriately relative to your other work. If you have any questions, feel free to ask me (Greg Grossmeier).

Hi @Gehel and @EBernhardson - can we take a look at this to see if it's still valid?

@greg, the answer to your concerns might have to wait until early next week when @Gehel returns from his offsite travels, but thanks for bringing it up!

Let's go ahead and start working on this next.

ArielGlenn subscribed. Oct 24 2016, 8:57 AM

debt assigned this task to Gehel. Oct 25 2016, 5:46 PM

debt edited projects, added Discovery-Search (Current work); removed Discovery-Search.

Change 318055 had a related patch set uploaded (by Gehel):
elasticsearch - enable garbage collection logs on relforge servers

https://gerrit.wikimedia.org/r/318055

gerritbot added a project: Patch-For-Review. Oct 26 2016, 9:03 AM

Gehel moved this task from Incoming to Needs review on the Discovery-Search (Current work) board. Oct 26 2016, 9:36 AM

Change 318055 merged by Gehel:
elasticsearch - enable garbage collection logs on relforge servers

https://gerrit.wikimedia.org/r/318055

Gehel moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. Oct 27 2016, 12:04 PM

Mentioned in SAL (#wikimedia-operations) [2016-10-27T12:04:25Z] <gehel> restart elasticsearch on relforge to activate GC logs - T134853

The Puppet change is deployed and GC logs are available on relforge. I will wait a few days to check that everything works before activating them on the production clusters as well.

Change 318353 had a related patch set uploaded (by Gehel):
elasticsearch - enable GC logs by default

https://gerrit.wikimedia.org/r/318353

I will assume that @Gehel's patch resolves this issue. Please reopen if there are any issues. :-)

The patch *should* resolve the issue, but it is not yet deployed. At this point GC logs are enabled on the relforge cluster, but nowhere else. I'm reopening this and will close it for real fairly soon.

Change 318353 merged by Gehel:
elasticsearch - enable GC logs by default

https://gerrit.wikimedia.org/r/318353

Gehel closed this task as Resolved. Nov 1 2016, 9:33 AM

@Gehel Maybe we can share learnings about GC, and in particular which GC to use, over in T148478, where we are trying to optimize Gerrit (see T148478#2750773 and T148478#2760146). Are you using G1?

FWIW, here's where we enabled GC logging: https://gerrit.wikimedia.org/r/#/c/317582/ and https://gerrit.wikimedia.org/r/#/c/318067/

Krinkle edited projects, added Sustainability (Incident Followup); removed Wikimedia-Incident. Apr 28 2020, 9:50 PM
