Add trash folder to hadoop
Closed, Resolved · Public

Event Timeline

Nuria created this task. · Mar 6 2018, 10:21 PM
Restricted Application added a subscriber: Aklapper. · Mar 6 2018, 10:21 PM
Nuria assigned this task to elukey. · Mar 28 2018, 10:43 PM
Nuria triaged this task as High priority.
Nuria added a project: Analytics-Kanban.
elukey moved this task from Backlog to In Progress on the User-Elukey board. · Mar 29 2018, 10:34 AM

Change 423156 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet/cdh@master] cdh::hadoop: add the config support for HDFS Trash

https://gerrit.wikimedia.org/r/423156

elukey added a comment (edited). · Mar 30 2018, 1:30 PM

I tested the two values set in the above patch in labs:

elukey@hadoop-master-1:~$ hdfs dfs -rm -r -f /user/elukey/.sparkStaging
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
18/03/30 13:28:03 INFO fs.TrashPolicyDefault: Moved: 'hdfs://analytics-hadoop-labs/user/elukey/.sparkStaging' to trash at: hdfs://analytics-hadoop-labs/user/elukey/.Trash/Current/user/elukey/.sparkStaging

So here's how they work, from https://developer.ibm.com/hadoop/2015/10/22/hdfs-trash:

The deletion interval specifies how long (in minutes) a checkpoint is kept before it is deleted. It is the value of fs.trash.interval. The NameNode runs a thread to periodically remove expired checkpoints from the file system.

The emptier interval specifies how long (in minutes) the NameNode waits between runs of the thread that manages checkpoints. On each run the NameNode deletes checkpoints that are older than fs.trash.interval and creates a new checkpoint from /user/${username}/.Trash/Current. This frequency is determined by the value of fs.trash.checkpoint.interval, and it must not be greater than the deletion interval; this ensures that within each emptier window there are one or more checkpoints in the trash.
For example, set

fs.trash.interval = 360 (deletion interval = 6 hours)
fs.trash.checkpoint.interval = 60 (emptier interval = 1 hour)

This causes the NameNode to create a new checkpoint every hour and to delete checkpoints that have existed longer than 6 hours.
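
As a sanity check, the effective values can be read back from the local Hadoop client configuration with the standard hdfs getconf subcommand. A minimal sketch; the numbers in the comments are the example values above, not necessarily what a given cluster runs with:

# Print the effective trash settings from the local Hadoop configuration.
hdfs getconf -confKey fs.trash.interval            # e.g. 360
hdfs getconf -confKey fs.trash.checkpoint.interval # e.g. 60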

In my case I've set the checkpoint interval to 5 minutes and the deletion interval to 30 minutes; this is the status after a few minutes:

elukey@hadoop-master-1:~$ hdfs dfs -ls /user/elukey/.Trash
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Found 1 items
drwx------   - elukey hdfs          0 2018-03-30 13:28 /user/elukey/.Trash/180330133000
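
Since files keep their original path layout under .Trash, recovering a deleted file is just a move back out. A sketch, with a hypothetical file name; the checkpoint directory is whatever the ls above shows, or Current for deletes that haven't been checkpointed yet:

# Restore a deleted file by moving it out of the trash (file name is hypothetical).
hdfs dfs -mv /user/elukey/.Trash/Current/user/elukey/some_file /user/elukey/some_file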
Nuria added a comment. · Mar 30 2018, 3:22 PM

Let's discuss these values with the team. I was thinking the trash should persist for several days, but the deletes we run on a cron (retention) should skip the trash, which I think should be possible.
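
Skipping the trash is already supported by the CLI via the -skipTrash flag, so the retention crons would run something like the sketch below. The path is hypothetical; note that -skipTrash deletes immediately, with no recovery possible:

# Permanently delete, bypassing the trash entirely (for retention jobs).
hdfs dfs -rm -r -skipTrash /wmf/data/some/old/partition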

elukey moved this task from Next Up to In Progress on the Analytics-Kanban board. · Mar 30 2018, 3:34 PM

Change 423156 merged by Elukey:
[operations/puppet/cdh@master] cdh::hadoop: add the config support for HDFS Trash

https://gerrit.wikimedia.org/r/423156

Change 423613 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::hadoop::common: allow to enable/disable the HDFS trash

https://gerrit.wikimedia.org/r/423613

Change 423613 merged by Elukey:
[operations/puppet@production] profile::hadoop::common: allow to enable/disable the HDFS trash

https://gerrit.wikimedia.org/r/423613

Change 423844 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/refinery@master] Append '-skipTrash' to all the hdfs -rm invocations

https://gerrit.wikimedia.org/r/423844

Change 423845 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet/cdh@master] Add the -skipTrash option to hdfs -rm

https://gerrit.wikimedia.org/r/423845

Change 423845 merged by Elukey:
[operations/puppet/cdh@master] Add the -skipTrash option to hdfs -rm

https://gerrit.wikimedia.org/r/423845

Change 423844 merged by Elukey:
[analytics/refinery@master] Append '-skipTrash' to all the hdfs -rm invocations

https://gerrit.wikimedia.org/r/423844

Change 424237 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_cluster::hadoop:master|standby: enable HDFS trash

https://gerrit.wikimedia.org/r/424237

elukey moved this task from In Progress to Done on the Analytics-Kanban board. · Apr 6 2018, 7:11 AM
elukey moved this task from Done to Ready to Deploy on the Analytics-Kanban board.

Change 424237 merged by Elukey:
[operations/puppet@production] role::analytics_cluster::hadoop:master|standby: enable HDFS trash

https://gerrit.wikimedia.org/r/424237

Mentioned in SAL (#wikimedia-operations) [2018-04-11T16:44:14Z] <elukey> restart hadoop hdfs namenodes on analytics100[12] to pick up HDFS Trash settings - T189051

Added documentation to https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster#recover_files_deleted_by_mistake_using_the_hdfs_CLI_rm_command. The last step is to send a mail to analytics@ (and possibly research and engineering?) to announce the new feature.

elukey moved this task from Ready to Deploy to Done on the Analytics-Kanban board. · Apr 12 2018, 8:25 AM
elukey moved this task from In Progress to Done on the User-Elukey board. · Apr 13 2018, 7:51 AM
Nuria closed this task as Resolved. · Apr 17 2018, 3:01 AM