Requestor provided information and prerequisites
As part of T342331 [EPIC] Set up a sustainable tech stack for Wikidata Analytics, the WMDE team requires an Airflow instance to run various analytics jobs. That instance is being set up in T340648 [Airflow] Setup Airflow instance for WMDE.
We need a system user to manage the WMDE-related analytics/data jobs and the admin tasks around them, such as starting, stopping, and restarting Airflow services.
We therefore need to create a system user analytics-wmde with the same uid/gid across nodes (airflow, stat100x, hadoop worker nodes, etc.), and then add that system user to the analytics-privatedata-users group so that it can carry out functions such as submitting jobs to YARN and running regular Airflow service maintenance.
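Once the user exists, the "same uid/gid across nodes" requirement could be spot-checked with something like the sketch below. It is only an illustration: it assumes the output of `getent passwd analytics-wmde` has already been collected from each node by some other means, and the host names and the uid/gid values 950 and 951 are made-up placeholders, not the values that will actually be assigned.

```python
from collections import Counter
from typing import Dict, Tuple


def parse_passwd_entry(line: str) -> Tuple[int, int]:
    """Return (uid, gid) from one passwd-format line.

    The passwd format is name:password:uid:gid:gecos:home:shell.
    """
    fields = line.strip().split(":")
    if len(fields) < 4:
        raise ValueError(f"not a passwd entry: {line!r}")
    return int(fields[2]), int(fields[3])


def find_mismatches(entries: Dict[str, str]) -> Dict[str, Tuple[int, int]]:
    """Return host -> (uid, gid) for hosts that differ from the most common pair."""
    ids = {host: parse_passwd_entry(line) for host, line in entries.items()}
    if not ids:
        return {}
    # Treat the most frequently seen (uid, gid) pair as the reference value.
    reference = Counter(ids.values()).most_common(1)[0][0]
    return {host: pair for host, pair in ids.items() if pair != reference}


# Hypothetical sample data: host names and ids are placeholders.
sample = {
    "host-a": "analytics-wmde:x:950:950::/nonexistent:/bin/sh",
    "host-b": "analytics-wmde:x:950:950::/nonexistent:/bin/sh",
    "host-c": "analytics-wmde:x:951:951::/nonexistent:/bin/sh",
}
mismatches = find_mismatches(sample)
```

An empty result from `find_mismatches` would mean every checked node reports the same uid/gid for the user. Any host that is returned would need to be corrected before jobs are run there.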
Next, we shall add a group of users approved by the WMDE Engineering Manager. This group will:
- Manage WMDE-related analytics/data jobs as Analytics WMDE Airflow admins, using the system user analytics-wmde.
- Deploy Airflow DAGs.