We receive periodic alerts about the root volume on Hadoop workers becoming full.
For example:
btullis@an-worker1092:~$ df -h /
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/an--worker1092--vg-root   55G   49G  3.2G  94% /
Upon investigation, we find that most of the space is taken up in the /tmp directory.
--- / -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
36.2 GiB [##########] /tmp
6.3 GiB [# ] /var
4.5 GiB [# ] /usr
1.6 GiB [ ] /opt
8.4 MiB [ ] /etc
7.2 MiB [ ] /home
36.0 KiB [          ] /root

Of this used space in /tmp, the vast majority is consumed by jar files.
btullis@an-worker1092:~$ sudo find /tmp -name '*.jar' -exec du -ch {} + | grep total$
34G	total

What is more, most of those are more than 30 days old.
btullis@an-worker1092:~$ sudo find /tmp -name '*.jar' -mtime +30 -exec du -ch {} + | grep total$
32G	total

Many different jars are affected, although different versions of refinery-hive make up the vast majority of them.
btullis@an-worker1092:~$ sudo find /tmp -name '*.jar' -mtime +30 | awk -F / '{print $NF}' | sort | uniq -c | sort -rn | head -n 20
413 refinery-hive-0.2.1-shaded.jar
136 refinery-hive-0.2.54-shaded.jar
97 refinery-hive-0.2.48-shaded.jar
96 refinery-hive-0.2.42-shaded.jar
51 refinery-hive-0.2.30-shaded.jar
17 refinery-hive-0.0.91-SNAPSHOT.jar
12 refinery-hive-0.2.59-shaded.jar
7 refinery-job-0.2.1-shaded.jar
7 org.apache.iceberg_iceberg-spark-runtime-3.3_2.12-1.6.1.jar
7 iceberg-spark-runtime-3.3_2.12-1.6.1.jar
6 unused-1.0.0.jar
6 refinery-job-0.2.54-SNAPSHOT-shaded.jar
6 org.spark-project.spark_unused-1.0.0.jar
4 mysql-connector-j-8.2.0.jar
3 zstd-jni-1.4.8-1.jar
3 spark-token-provider-kafka-0-10_2.12-3.1.2.jar
3 spark-sql-kafka-0-10_2.12-3.1.2.jar
3 spark-avro_2.12-3.1.2.jar
3 snappy-java-1.1.8.2.jar
3 snakeyaml-1.26.jar

It would be good to work out how best to prevent this gradual build-up of jar files in /tmp on the an-worker nodes.
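One possible stopgap, pending a proper fix for whatever is leaving the jars behind, is a periodic cleanup job based on the same find criteria used in the investigation above. The sketch below is hypothetical: `clean_old_jars` is an illustrative name, not an existing script on the an-worker hosts, and the 30-day threshold simply mirrors the `-mtime +30` measurement above.

```shell
#!/bin/bash
# Hypothetical sketch of a cleanup helper for stale jar files in /tmp.
# clean_old_jars is an illustrative name, not an existing tool.
set -euo pipefail

# Delete jar files under $1 last modified more than $2 days ago
# (default 30). -xdev keeps find on a single filesystem, and quoting
# '*.jar' stops the shell from expanding the glob before find sees it.
clean_old_jars() {
    local dir="$1"
    local max_age_days="${2:-30}"
    find "$dir" -xdev -name '*.jar' -mtime +"$max_age_days" -delete
}
```

Run from a cron job or systemd timer, this would cap the build-up at roughly one month's worth of jars. An alternative worth considering is a tmpfiles.d age rule, letting systemd-tmpfiles expire old files in /tmp instead of a bespoke script.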