Test and possibly raise the Xmx/Xms settings for the Hadoop Yarn Nodemanager and HDFS Datanode daemons
Closed, Resolved (Public)

Description

The HDFS Datanode and Yarn Nodemanager daemons run on every Hadoop worker and constantly consume almost all of their allocated heap, which often triggers old-generation garbage collections. The 2G Xmx/Xms limit was probably set long ago and looks inadequate for the cluster's current workload, so some tuning is needed.
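
One way to check the symptom described above is to look at the old-generation occupancy reported by `jstat -gcutil <pid>` for a Datanode or Nodemanager process. The following is only a minimal sketch: the helper names, the 90% threshold, and the sample output are illustrative assumptions and do not come from this task or from the actual cluster.

```python
from __future__ import annotations

# Illustrative, made-up sample in the column layout printed by
# `jstat -gcutil` on Java 8. These are NOT real measurements from the cluster.
SAMPLE = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  63.40  96.80  97.10  93.55   5120   61.204    48   12.871   74.075
"""


def parse_gcutil(output: str) -> dict[str, float]:
    """Map each column name to its value, using the header and first data row."""
    lines = [line for line in output.strip().splitlines() if line.strip()]
    header, values = lines[0].split(), lines[1].split()
    # jstat prints "-" for columns that do not apply; skip those.
    return {
        name: float(value)
        for name, value in zip(header, values)
        if value != "-"
    }


def old_gen_under_pressure(stats: dict[str, float], threshold_pct: float = 90.0) -> bool:
    """Return True when old-gen occupancy (column "O") reaches the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return stats["O"] >= threshold_pct


if __name__ == "__main__":
    stats = parse_gcutil(SAMPLE)
    print("old gen %:", stats["O"], "full GCs:", int(stats["FGC"]))
    print("under pressure:", old_gen_under_pressure(stats))
```

Running the same check before and after a heap change gives a rough indication of whether a larger Xmx/Xms relieves the old-generation pressure.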

Event Timeline

Change 386147 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: raise Xmx/Xms settings for hadoop worker daemons on an1030

https://gerrit.wikimedia.org/r/386147

Change 386147 merged by Elukey:
[operations/puppet@production] hadoop: raise Xmx/Xms settings for hadoop worker daemons on an1030

https://gerrit.wikimedia.org/r/386147

Mentioned in SAL (#wikimedia-operations) [2017-10-25T13:30:00Z] <elukey> restart yarn nodemanager and hdfs datanode on analytics1030 to apply new JVM settings - T178876

Change 390237 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: raise jvm heap sizes for HDFS datanode and Yarn daemons

https://gerrit.wikimedia.org/r/390237

elukey changed the task status from Open to Stalled. Nov 24 2017, 3:24 PM
elukey moved this task from Next Up to In Progress on the Analytics-Kanban board.
elukey moved this task from In Progress to Paused on the Analytics-Kanban board.

This will be done, but it needs to wait for the next round of JVM restarts.

Change 394256 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::hadoop::worker: increase Java Xmx to 4G for Datanode/Nodemanager

https://gerrit.wikimedia.org/r/394256

Change 394256 merged by Elukey:
[operations/puppet@production] profile::hadoop::worker: increase Java Xmx to 4G for Datanode/Nodemanager

https://gerrit.wikimedia.org/r/394256

Change 390237 abandoned by Elukey:
hadoop: raise jvm heap sizes for HDFS datanode and Yarn daemons

https://gerrit.wikimedia.org/r/390237