We have recently seen a number of alerts from Hadoop worker nodes that are running low on disk space on their root partitions.
This is causing the hadoop-yarn-nodemanager processes to crash because they are unable to allocate any disk space.
See here for report emails: https://groups.google.com/a/wikimedia.org/g/data-platform-alerts/search?q=nodemanager%20critical%20after%3A2025-01-09
For example, we can see that an-worker1154 suddenly used up 20% of the space in /, at which point YARN crashed.
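
To spot-check a worker while investigating, a minimal sketch along the following lines could report how full the root partition is. It uses only the Python standard library. The 90% threshold and the function names are assumptions made for illustration; they are not taken from the actual alerting configuration.

```python
import shutil

# Hypothetical threshold for illustration; not the value used by the real alerts.
LOW_SPACE_THRESHOLD_PERCENT = 90.0


def used_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    # Note: this can differ slightly from `df`, which accounts for reserved blocks.
    return 100.0 * usage.used / usage.total


def is_low_on_space(percent_used: float,
                    threshold: float = LOW_SPACE_THRESHOLD_PERCENT) -> bool:
    """Return True when usage is at or above the threshold."""
    return percent_used >= threshold


if __name__ == "__main__":
    pct = used_percent("/")
    state = "LOW" if is_low_on_space(pct) else "ok"
    print(f"/ is {pct:.1f}% used ({state})")
```

Running this periodically on an affected host would show whether the usage climbs gradually or jumps suddenly, as it appears to have done on an-worker1154.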