Create HDFS /tmp/ cleaner
Closed, Resolved | Public | 5 Estimated Story Points

Description

Hadoop doesn't clean up HDFS /tmp on its own, wth!

We should make a Hadoop cleaner job that uses the Hadoop Java API to look for files with old mtimes.
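
A minimal sketch of the mtime check such a job might perform. This is not the HDFSCleaner code from the patches below. The class OldFileFilter, its Entry type, the example paths, and the 31-day threshold are all hypothetical. To stay runnable without a cluster it uses only the Java standard library; in a real job the path and mtime would presumably come from the Hadoop Java API, for example FileSystem.listStatus() and FileStatus.getModificationTime().

```java
import java.util.ArrayList;
import java.util.List;

public class OldFileFilter {

    /** Hypothetical stand-in for the path and mtime a Hadoop FileStatus would provide. */
    public static final class Entry {
        public final String path;
        public final long mtimeMillis;

        public Entry(String path, long mtimeMillis) {
            this.path = path;
            this.mtimeMillis = mtimeMillis;
        }
    }

    /** Returns the entries whose mtime is more than maxAgeMillis before nowMillis. */
    public static List<Entry> olderThan(List<Entry> entries, long nowMillis, long maxAgeMillis) {
        long cutoff = nowMillis - maxAgeMillis;
        List<Entry> old = new ArrayList<>();
        for (Entry e : entries) {
            if (e.mtimeMillis < cutoff) {
                old.add(e);
            }
        }
        return old;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long day = 24L * 60 * 60 * 1000;

        // Assumed example data; a real job would list the HDFS directory instead.
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("/tmp/stale-job-output", now - 40 * day));
        entries.add(new Entry("/tmp/fresh-job-output", now - 1 * day));

        // Only report candidates here; deletion is left out of this sketch.
        for (Entry e : olderThan(entries, now, 31 * day)) {
            System.out.println("would delete: " + e.path);
        }
    }
}
```

Keeping the age check separate from the filesystem calls makes the selection logic testable on its own, and a dry run like the one in main() is a cheap way to confirm what would be removed before deleting anything.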

Related Objects

Status     Subtype   Assigned   Task
Resolved             None
Resolved             Ottomata

Event Timeline

Ottomata created this task.
Ottomata moved this task from Incoming to Operational Excellence on the Analytics board.

Change 543897 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] Add HDFSCleaner to aid in cleaning HDFS tmp directories

https://gerrit.wikimedia.org/r/543897

Change 543897 merged by Ottomata:
[analytics/refinery/source@master] Add HDFSCleaner to aid in cleaning HDFS tmp directories

https://gerrit.wikimedia.org/r/543897

Ottomata set the point value for this task to 5. (Oct 30 2019, 2:31 PM)

Change 548468 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Include hdfs_cleaner an an-coord node to clean HDFS /tmp dir

https://gerrit.wikimedia.org/r/548468

Change 548468 merged by Ottomata:
[operations/puppet@production] Include hdfs_cleaner an an-coord node to clean HDFS /tmp dir

https://gerrit.wikimedia.org/r/548468

Change 548783 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] HDFSCleaner - use Path toString in log messages instead of getName

https://gerrit.wikimedia.org/r/548783

Change 548783 merged by Ottomata:
[analytics/refinery/source@master] HDFSCleaner - use Path toString in log messages instead of getName

https://gerrit.wikimedia.org/r/548783

Moving back to in progress to use HDFS Trash

Change 548850 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use HDFS trash settings as default everywhere

https://gerrit.wikimedia.org/r/548850

Change 548909 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] HDFSCleaner improvements

https://gerrit.wikimedia.org/r/548909

On an-coord1001 I can see:

-- Logs begin at Tue 2019-11-05 13:34:31 UTC, end at Wed 2019-11-06 07:06:56 UTC. --
Nov 05 23:00:01 an-coord1001 systemd[1]: Started Run the HDFSCleaner job to keep HDFS /tmp dir clean of old files..
Nov 05 23:00:02 an-coord1001 java[195946]: Error: Could not find or load main class classpath)
Nov 05 23:00:02 an-coord1001 systemd[1]: hdfs-cleaner.service: Main process exited, code=exited, status=1/FAILURE
Nov 05 23:00:02 an-coord1001 systemd[1]: hdfs-cleaner.service: Unit entered failed state.
Nov 05 23:00:02 an-coord1001 systemd[1]: hdfs-cleaner.service: Failed with result 'exit-code'.

The unit failed; I have acked the alarm in Icinga :)

Change 549092 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery@master] Add bin/hdfs-cleaner wrapper script

https://gerrit.wikimedia.org/r/549092

Change 549092 merged by Ottomata:
[analytics/refinery@master] Add bin/hdfs-cleaner wrapper script

https://gerrit.wikimedia.org/r/549092

Change 548909 merged by Ottomata:
[analytics/refinery/source@master] HDFSCleaner improvements

https://gerrit.wikimedia.org/r/548909

Change 548850 abandoned by Ottomata:
Use HDFS trash settings as default everywhere

Reason:
Not needed.

https://gerrit.wikimedia.org/r/548850

Change 549207 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] HDFSCleaner - change command args

https://gerrit.wikimedia.org/r/549207

Change 549207 merged by Ottomata:
[operations/puppet@production] HDFSCleaner - change command args

https://gerrit.wikimedia.org/r/549207