
Check home/HDFS leftovers of aniketars
Closed, Resolved (Public)

Description

Access for Aniket Bharti has been removed. We need to check whether any data was left in their home directories on the stat* hosts or in HDFS, since they were a member of the "analytics-privatedata-users" group.

The Kerberos principal has already been removed. Point of contact wrt keeping data is @Miriam

Event Timeline

Hi @Miriam!

These are the files that belonged to Aniket Bharti.
Can you please confirm whether they should be deleted or moved to another location?

Thank you!

====== stat1004 ======
total 0

====== stat1005 ======
total 4556
drwxr-xr-x  3 37173 wikidev    4096 Jul  2 17:49 1. Model Selection Experiment
drwxr-xr-x 14 37173 wikidev    4096 Jul  2 18:15 3. PCA Experiments
drwxr-xr-x  7 37173 wikidev    4096 Jul  2 19:17 4. Spark
-rw-r--r--  1 37173 wikidev      81 Feb 13 08:09 bad_files.txt
-rw-r--r--  1 37173 wikidev   17818 Jul  2 16:25 CreatingDatasetSample.ipynb
-rw-r--r--  1 37173 wikidev    8686 Jul  2 16:15 ImageNameToURL.ipynb
-rw-r--r--  1 37173 wikidev       0 Feb 12 14:46 __init__.py
drwxr-xr-x  3 37173 wikidev    4096 Feb 23 12:03 meta
drwxr-xr-x  7 37173 wikidev    4096 Mar 15 10:12 models
-rw-r--r--  1 37173 wikidev 4601798 Mar  7 12:59 names.txt
drwxr-xr-x  2 37173 wikidev    4096 Feb 13 07:52 __pycache__

====== stat1006 ======
total 0

====== stat1007 ======
total 0

====== stat1008 ======
total 1507144
-rw-r--r-- 1 37173 wikidev     64254 Feb 10 14:10 annoyIndexExperiment.ipynb
-rw-r--r-- 1 37173 wikidev     56641 Feb  3 15:33 annoyIndex.ipynb
-rw-r--r-- 1 37173 wikidev 390529741 Feb  3 14:57 efficientNetB7-00001-04b253b8-db8c-4d14-a23f-3433a86841b4-c000.csv.gz
-rw-r--r-- 1 37173 wikidev 392958791 Feb  4 13:00 efficientNetB7-00002-04b253b8-db8c-4d14-a23f-3433a86841b4-c000.csv.gz
-rw-r--r-- 1 37173 wikidev     25354 Feb  3 15:14 embeddings.ipynb
-rw-r--r-- 1 37173 wikidev     52073 Feb 21 14:16 ExpendabilityOfAnnoy.ipynb
-rw-r--r-- 1 37173 wikidev    110758 Feb 10 14:13 ExperimentPCAWithEfficient.ipynb
-rw-r--r-- 1 37173 wikidev   3207364 Feb  3 15:29 indexing.csv
drwxr-xr-x 2 37173 wikidev      4096 Feb 10 13:50 meta
drwxr-xr-x 5 37173 wikidev      4096 Feb  8 13:58 model
-rw-r--r-- 1 37173 wikidev    156016 Feb  8 15:46 outsideDataPCA.ipynb
-rw-r--r-- 1 37173 wikidev      4497 Feb  2 13:15 palyground.ipynb
-rw-r--r-- 1 37173 wikidev     45557 Feb 21 14:14 ScalabilityOfAnnoyIndex.jpg
-rw-r--r-- 1 37173 wikidev    117782 Feb  4 13:00 tf_gpu.ipynb
drwxrwxr-x 5 37173 wikidev      4096 Feb  2 15:15 tf-rocm
-rw-r--r-- 1 37173 wikidev 377960484 Feb  4 14:05 tree1.ann
-rw-r--r-- 1 37173 wikidev 377960484 Feb  3 15:21 tree.ann
drwxr-xr-x 3 37173 wikidev      4096 Feb 21 14:11 trees

======= HDFS ========
Found 14 items
drwxr-x---   - aniketars aniketars          0 2022-06-27 20:19 /user/aniketars/.sparkStaging
drwxr-x---   - aniketars aniketars          0 2022-04-27 15:01 /user/aniketars/EfficientNetB3V2.parquet
drwxr-x---   - aniketars aniketars          0 2022-04-28 11:44 /user/aniketars/EfficientNetB3V2_features.parquet
drwxr-x---   - aniketars aniketars          0 2022-04-28 18:59 /user/aniketars/EfficientNetB3V2_pca256.parquet
drwxr-x---   - aniketars aniketars          0 2022-06-24 14:26 /user/aniketars/Embeddings_NSFW.parquet
drwxr-x---   - aniketars aniketars          0 2022-06-20 19:31 /user/aniketars/ImageNSFWProbs.parquet
drwxr-x---   - aniketars aniketars          0 2022-06-12 03:39 /user/aniketars/OpenNSFW.parquet
-rw-r-----   3 aniketars aniketars   40610976 2022-03-15 09:35 /user/aniketars/data.part01.tsv
drwxr-x---   - aniketars aniketars          0 2022-04-04 13:36 /user/aniketars/efficientNetB3V2.parquet
drwxr-x---   - aniketars aniketars          0 2022-03-28 12:18 /user/aniketars/idx2url.csv
drwxr-x---   - aniketars aniketars          0 2022-03-25 12:30 /user/aniketars/output.parquet
drwxr-x---   - aniketars aniketars          0 2022-03-28 11:51 /user/aniketars/outputTest.parquet
drwxr-x---   - aniketars aniketars          0 2022-04-11 13:10 /user/aniketars/pca
drwxr-x---   - aniketars aniketars          0 2022-04-11 11:28 /user/aniketars/pca_sample.parquet

====== Hive =========

Thanks @mforns ! Would it be possible to move these files to my home directories?

Thanks!

Miriam

@Miriam Sure! I can copy them to your home folder, and then when you confirm you have everything, I will delete the original ones.
In which of your home directories do you want me to put these files? HDFS? Or any particular stat machine?
Also, can you give me your username? Couldn't find it!
Cheers!

Thanks @mforns!
Could you kindly put everything on stat1005, if that is ok with you?
My username is mirrys
Thanks!

Heya @Miriam :]
I underestimated the size of the data, I'm sorry.
I've moved the part of aniketars' data that was in their HDFS home folder over to hdfs://user/mirrys/aniketars (still on HDFS - it is too big to move to your stat1005 home folder).
To move the data on stat boxes I need the help of an SRE (root access).

@BTullis, heya :], can you please move the data in stat1005:/home/aniketars and stat1008:/home/aniketars (both) to stat1005:/home/mirrys/aniketars?
If you think it's better to keep them on their current machine, then please can you mv /home/aniketars/* /home/mirrys/aniketars in both stat1005 and stat1008?
Thanks a lot!
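
For reference, here is a minimal sketch of the second option (moving in place on each host). It is not necessarily the command that was actually run. It assumes GNU find and mv and enough privileges to read the source and write the destination; `move_home_contents` is a hypothetical helper name. Unlike the `/home/aniketars/*` glob, it also picks up dotfiles, and it copes with names containing spaces such as "4. Spark".

```shell
# Hypothetical helper: move every top-level entry of SRC into DEST.
# Usage: move_home_contents SRC DEST
move_home_contents() {
  # Create the destination directory if it does not exist yet.
  mkdir -p "$2"
  # -mindepth 1 / -maxdepth 1 select only the direct children of SRC,
  # including hidden files. mv -n refuses to overwrite an existing entry
  # in DEST, so a second run cannot clobber files that were already moved.
  find "$1" -mindepth 1 -maxdepth 1 -exec mv -n -- {} "$2"/ \;
}
```

On each of stat1005 and stat1008 this would be invoked as `move_home_contents /home/aniketars /home/mirrys/aniketars`. Ownership of the moved files is left unchanged, so a separate chown may still be needed.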

@EChetty should I assign this to the Ops Week person for this Sprint?

BTullis triaged this task as Low priority.

Hi @Miriam - I've moved those files for you now. You should find all of the files under /home/mirrys/aniketars on stat1005.

Hope that's all OK. Let me know if you have any queries.