Access for Razzi Abuissa has been removed. We need to check whether any data was left in their home directories on stat*/HDFS, since they were a member of the "analytics-privatedata-users" group.
The Kerberos principal has already been removed.
====== stat1004 ======
total 513244
drwxr-xr-x 2 26051 wikidev 4096 Jul 20 2021 hdfs-namenode-fsimage
-rw-rw-r-- 1 26051 wikidev 1245367 Jan 10 16:42 part.txt
-rw-r--r-- 1 26051 wikidev 3155 Oct 28 2020 razzi-key.txt
drwxrwxr-x 11 26051 wikidev 4096 Mar 16 2021 refinery
-rw-r--r-- 1 root root 524288000 May 18 2021 test.img
drwxrwxr-x 6 26051 wikidev 4096 Dec 7 2020 venv
drwxrwxr-x 6 26051 wikidev 4096 Dec 7 2020 venv3
====== stat1005 ======
total 102740
drwxrwxr-x 16 26051 wikidev 4096 Feb 3 16:55 amundsen
-rw-r--r-- 1 26051 wikidev 64837 Feb 9 2021 Detailed_Pageview_Report.ipynb
drwxr-xr-x 11 26051 wikidev 4096 Jun 11 2020 neo4j-community-4.0.6
-rw-rw-r-- 1 26051 wikidev 105113455 Jun 16 2020 neo4j-community-4.0.6-unix.tar.gz
-rw-rw-r-- 1 26051 wikidev 0 Feb 3 17:00 neo4j_log.txt
-rw-rw-r-- 1 26051 wikidev 136 Feb 3 16:55 run_neo4j.sh
-rw-rw-r-- 1 26051 wikidev 141 Oct 6 2020 test.hql
-rw-r--r-- 1 26051 wikidev 833 Oct 6 2020 Untitled.ipynb
drwxr-xr-x 7 26051 wikidev 4096 Oct 6 2020 venv
====== stat1006 ======
total 8
-rw-r--r-- 1 26051 wikidev 72 Oct 8 2020 Untitled.ipynb
drwxr-xr-x 7 26051 wikidev 4096 Oct 8 2020 venv
====== stat1007 ======
total 422244
drwxrwxr-x 3 26051 wikidev 4096 Oct 30 2020 check_maxmind_backup
-rw-r--r-- 1 root root 49287 Oct 6 2020 Detailed_Pageview_Report.ipynb
drwxr-xr-x 9 26051 wikidev 4096 Jul 2 2021 elasticsearch-7.13.3
-rw-rw-r-- 1 26051 wikidev 327177336 Jul 7 2021 elasticsearch-7.13.3-linux-x86_64.tar.gz
drwxr-xr-x 11 26051 wikidev 4096 Jun 11 2020 neo4j-community-4.0.6
-rw-rw-r-- 1 26051 wikidev 105113455 Jun 16 2020 neo4j-community-4.0.6-unix.tar.gz
-rw-rw-r-- 1 26051 wikidev 0 Feb 3 16:49 neo4j_log.txt
-rw-rw-r-- 1 26051 wikidev 135 Feb 3 16:49 run_neo4j.sh
drwxrwxr-x 11 26051 wikidev 4096 Oct 20 2020 source
====== stat1008 ======
total 552944
drwxrwxr-x 17 26051 wikidev 4096 Feb 4 15:24 amundsen
drwxrwxr-x 2 26051 wikidev 4096 Mar 14 18:10 bin
drwxrwxr-x 2 26051 wikidev 4096 Mar 14 18:10 compiler_compat
-rw------- 1 26051 wikidev 101524037 Mar 14 18:09 conda_dist_env.2022-03-14T18.04.00.tgz
drwxrwxr-x 11 26051 wikidev 4096 Mar 14 18:12 conda_karapace
drwxrwxr-x 2 26051 wikidev 4096 Mar 14 18:10 conda-meta
lrwxrwxrwx 1 26051 wikidev 20 Feb 4 15:30 elasticsearch -> elasticsearch-7.13.3
drwxr-xr-x 10 26051 wikidev 4096 Feb 3 05:57 elasticsearch-7.13.3
-rw-rw-r-- 1 26051 wikidev 327177336 Jul 7 2021 elasticsearch-7.13.3-linux-x86_64.tar.gz
drwxrwxr-x 4 26051 wikidev 4096 Apr 14 2021 flerovium_backup
drwxrwxr-x 8 26051 wikidev 4096 Mar 14 18:10 include
-rw-r--r-- 1 26051 wikidev 736 Mar 14 18:14 karapace.config.json
drwxrwxr-x 15 26051 wikidev 4096 Mar 14 18:10 lib
lrwxrwxrwx 1 26051 wikidev 22 Feb 3 18:25 neo4j -> neo4j-community-3.5.30
drwxr-xr-x 11 26051 wikidev 4096 Feb 3 19:25 neo4j-community-3.5.30
-rw-r--r-- 1 26051 wikidev 137349874 Feb 3 05:44 neo4j-community-3.5.30-unix.tar.gz
-rw------- 1 26051 root 301 Sep 14 2020 piwik.pw
-rw-rw-r-- 1 26051 wikidev 220 Oct 7 2020 popular_pages.hql
-rw-r--r-- 1 26051 wikidev 29543 Apr 16 2021 pyspark_install_pandas.ipynb
-rw-r--r-- 1 26051 wikidev 39947 Apr 16 2021 pyspark_wmfdata.ipynb
drwxrwxr-x 9 26051 wikidev 4096 Mar 14 18:10 share
drwxrwxr-x 3 26051 wikidev 4096 Mar 14 18:10 ssl
-rw-rw-r-- 1 26051 wikidev 7 Sep 15 2020 testsecret.txt
drwxrwxr-x 3 26051 wikidev 4096 Mar 14 18:10 x86_64-conda_cos6-linux-gnu
drwxrwxr-x 3 26051 wikidev 4096 Mar 14 18:10 x86_64-conda-linux-gnu
======= HDFS ========
Found 7 items
drwx------ - razzi razzi 0 2021-04-01 00:00 /user/razzi/.Trash
drwxr-xr-x - razzi razzi 0 2021-11-09 02:59 /user/razzi/.sparkStaging
drwx------ - razzi razzi 0 2021-04-16 20:32 /user/razzi/.staging
drwxr-xr-x - razzi razzi 0 2020-10-15 20:06 /user/razzi/16
-rw-r----- 3 analytics analytics 17 2021-03-19 21:06 /user/razzi/mysql-analytics-client-pw.txt
drwxr-x--- - analytics analytics 0 2021-03-23 01:08 /user/razzi/sqoop
drwxr-x--- - razzi razzi 0 2021-05-25 18:10 /user/razzi/testdir
====== Hive =========
drwxr-x--- - razzi analytics-privatedata-users 0 2021-04-16 20:03 /user/hive/warehouse/razzi.db/banner_history
drwxr-x--- - razzi analytics-privatedata-users 0 2021-04-16 20:32 /user/hive/warehouse/razzi.db/banner_history2
drwxr-x--- - razzi analytics-privatedata-users 0 2020-10-06 19:57 /user/hive/warehouse/razzi.db/test
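For anyone repeating this kind of check by hand, the per-directory layout above can be approximated with a small shell helper. This is only a sketch and not the script that produced the listing: the function name `list_leftovers` is hypothetical, and it only inspects local directories on the host where it runs.

```shell
# Hypothetical helper (not the tool used in this task): print an
# "====== name ======" header followed by `ls -l` for each directory given.
list_leftovers() {
    for dir in "$@"; do
        printf '====== %s ======\n' "$dir"
        if [ -d "$dir" ]; then
            ls -l "$dir"
        else
            # Nothing left behind on this host, or the path was mistyped.
            echo "(no such directory)"
        fi
    done
}
```

On a stat host this would be called as `list_leftovers /home/razzi`. The HDFS side is not covered by the helper; `hdfs dfs -ls /user/razzi` gives the equivalent listing there, assuming a valid Kerberos ticket.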
I've reviewed everything above, and it can all be safely deleted. An admin needs to do this with cumin; see the instructions (ping @Ottomata). The HDFS and Hive data has already been deleted; I took care of it.
I have now removed the files from the home directories.
btullis@cumin1001:~$ sudo cumin 'C:profile::analytics::cluster::client or C:profile::hadoop::master or C:profile::hadoop::master::standby' 'rm -rf /home/razzi'
15 hosts will be targeted:
an-airflow1001.eqiad.wmnet,an-coord[1001-1002].eqiad.wmnet,an-launcher1002.eqiad.wmnet,an-master[1001-1002].eqiad.wmnet,an-test-client1001.eqiad.wmnet,an-test-coord1001.eqiad.wmnet,an-test-master[1001-1002].eqiad.wmnet,stat[1004-1008].eqiad.wmnet
Ok to proceed on 15 hosts? Enter the number of affected hosts to confirm or "q" to quit
15
===== NO OUTPUT =====
PASS |█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100% (15/15) [00:12<00:00, 1.18hosts/s]
FAIL | | 0% (0/15) [00:12<?, ?hosts/s]
100.0% (15/15) success ratio (>= 100.0% threshold) for command: 'rm -rf /home/razzi'.
100.0% (15/15) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.