As an HDFS user, I want datasets to have meaningful permissions so that access to PII data is better controlled.
In T270629 it was decided to tighten dataset permissions by removing read permissions for others.
Search applications use the analytics-search-users group, and most of them should run as the analytics-search user.
Datasets and current permissions are as follows:
drwxr-x--- - analytics-search analytics-search-users 0 2021-01-07 01:01 /wmf/data/discovery/cirrus_namespace_index_map
drwxrwxr-x - analytics-search analytics-search-users 0 2021-01-07 00:41 /wmf/data/discovery/fulltext_head_queries
drwxrwxr-x - ebernhardson analytics-search-users 0 2020-06-16 00:30 /wmf/data/discovery/glent
drwxrwxr-x - analytics-search analytics-search-users 0 2019-10-21 01:39 /wmf/data/discovery/mjolnir
drwxrwxr-x - analytics-search analytics-search-users 0 2020-11-10 16:08 /wmf/data/discovery/ores
drwxrwxr-x - analytics-search analytics-search-users 0 2016-02-02 20:12 /wmf/data/discovery/popularity_score
drwxr-xr-x - analytics-search analytics-search-users 0 2020-01-15 17:54 /wmf/data/discovery/popularity_score_esbulk
drwxrwxr-x - analytics-search analytics-search-users 0 2020-07-15 23:50 /wmf/data/discovery/popularity_score_v2
drwxrwxr-x - ebernhardson analytics-search-users 0 2019-12-21 00:52 /wmf/data/discovery/query_clicks
drwxr-xr-x - analytics-search analytics-search-users 0 2020-07-20 21:10 /wmf/data/discovery/reports
drwxrwxr-x - ebernhardson analytics-search-users 0 2020-05-05 23:10 /wmf/data/discovery/search_satisfaction
drwxr-xr-x - analytics-search analytics-search-users 0 2021-01-04 20:19 /wmf/data/discovery/transfer_to_es
drwxr-xr-x - analytics-search analytics-search-users 0 2020-07-14 12:37 /wmf/data/discovery/wdqs
drwxrwxr-x - analytics-search analytics-search-users 0 2020-07-14 12:33 /wmf/data/discovery/wikidata
dataset | PII | obsolete |
cirrus_namespace_index_map | no | no |
fulltext_head_queries | yes | no |
glent | yes | no |
mjolnir | yes | no |
ores | no | no |
popularity_score | no | yes |
popularity_score_esbulk | no | yes |
popularity_score_v2 | no | no |
query_clicks | yes | no |
reports | yes | no? |
search_satisfaction | yes | no |
transfer_to_es | no | no |
wdqs | no | yes |
wikidata | no | no |
Restricting access to analytics-search-users alone would lock out every other user, including members of analytics-privatedata-users, yet there does not seem to be any reason why a member of analytics-privatedata-users should not read search datasets. Should we perhaps chgrp all search datasets that include PII to analytics-privatedata-users, and leave the others under analytics-search-users with o+rx?
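
To make the proposal concrete, here is a dry-run sketch that only prints the `hdfs dfs` commands it would imply and executes nothing. It rests on several assumptions not settled in this task: that PII datasets get group analytics-privatedata-users with no access for others, that non-PII datasets keep analytics-search-users with o+rx, and that the change is applied recursively (`-R`). The `plan` helper is hypothetical, and the PII flags are copied from the table above.

```python
# Dry-run sketch: builds the hdfs commands as strings, runs nothing.
BASE = "/wmf/data/discovery"

# PII flags copied from the dataset table in this task.
PII = {
    "cirrus_namespace_index_map": False,
    "fulltext_head_queries": True,
    "glent": True,
    "mjolnir": True,
    "ores": False,
    "popularity_score": False,
    "popularity_score_esbulk": False,
    "popularity_score_v2": False,
    "query_clicks": True,
    "reports": True,
    "search_satisfaction": True,
    "transfer_to_es": False,
    "wdqs": False,
    "wikidata": False,
}


def plan(dataset, has_pii):
    """Return the commands for one dataset (hypothetical helper)."""
    path = f"{BASE}/{dataset}"
    if has_pii:
        # Assumption: PII data goes to analytics-privatedata-users,
        # and others lose all access.
        return [
            f"hdfs dfs -chgrp -R analytics-privatedata-users {path}",
            f"hdfs dfs -chmod -R o-rwx {path}",
        ]
    # Assumption: non-PII data stays with analytics-search-users
    # and remains readable by others.
    return [
        f"hdfs dfs -chgrp -R analytics-search-users {path}",
        f"hdfs dfs -chmod -R o+rx {path}",
    ]


commands = [cmd for name, pii in PII.items() for cmd in plan(name, pii)]

if __name__ == "__main__":
    print("\n".join(commands))
```

The output is meant for review before anyone runs it. Whether the obsolete datasets should be included at all, and whether analytics-search-users members would still reach the PII datasets after the chgrp, is left open here.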
AC:
- permissions of all datasets under /wmf/data/discovery/ are clarified