
Resolve repeated GC alerts from cloudelastic
Closed, Resolved · Public

Description

Cloudelastic has started to alert more regularly over the last month: first on March 3rd, then nothing for a while, and now alerts on the 23rd, 24th, 29th, 30th, and the 1st. The alerts come from psi and omega, the small JVMs. In particular, the graphs show that just before the beginning of April we started running out of memory: the old pool filled up and the survivor pool hasn't been used since. The young pool retained enough working space to continue operating, but that won't be sustainable.
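
As a minimal sketch (not part of the task itself), the per-node pool occupancy described above can be checked with Elasticsearch's `_nodes/stats/jvm` API; the endpoint URL here is an assumption, substitute the actual cloudelastic host.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical endpoint, replace with the real one

# Fetch per-node JVM stats and print young/survivor/old pool usage.
with urllib.request.urlopen(f"{ES_URL}/_nodes/stats/jvm") as resp:
    stats = json.load(resp)

for node_id, node in stats["nodes"].items():
    mem = node["jvm"]["mem"]
    print(f'{node["name"]}: heap {mem["heap_used_percent"]}% used')
    for pool_name, pool in mem["pools"].items():
        used_mib = pool["used_in_bytes"] // (1024 * 1024)
        max_mib = pool["max_in_bytes"] // (1024 * 1024)
        print(f"  {pool_name}: {used_mib} MiB used (max {max_mib} MiB)")
```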

Plausibly we should either reduce memory usage (T279009 would likely help by cutting index/shard counts in half) or increase the JVM heap size; a quick comparison of the two is sketched below.
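
A small sketch, under the same endpoint assumption as above, for weighing those two options: pull the cluster-wide index/shard counts and the combined heap from `_cluster/stats` before deciding whether to shrink the footprint or grow the heap.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical endpoint, replace with the real one

# Fetch cluster-wide stats: index/shard totals vs. aggregate JVM heap.
with urllib.request.urlopen(f"{ES_URL}/_cluster/stats") as resp:
    stats = json.load(resp)

indices = stats["indices"]
heap = stats["nodes"]["jvm"]["mem"]
print(f'indices: {indices["count"]}, shards: {indices["shards"]["total"]}')
print(f'heap: {heap["heap_used_in_bytes"] >> 20} MiB used of '
      f'{heap["heap_max_in_bytes"] >> 20} MiB')
```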

Event Timeline

I suppose the alternate step 1 is to restart the JVMs and see if it happens again (it usually does).

I'm expecting that the work done in T279009, which freed up some heap on the smaller cloudelastic clusters, will resolve the GC alerts we've seen recently. Moving this to waiting for now; we can close it if the alerts don't start coming back as we try reindexing again.

We don't seem to be having problems with this anymore and aren't getting alerts. Calling it complete.

Gehel claimed this task.