In today's Analytics Systems hangtime meeting, we talked with @fkaelin and @gmodena about work they want to do with Airflow. I told them that Airflow is basically ready for testing, and we could create instances for them now if they liked. They do like! They understand that we are still iterating and figuring it out for ourselves too. It will do us all good to be able to work out best practices together.
Let's create instances for them now. We haven't done this before, so we'll likely need to formalize this process. It will be something like:
1. Create new Ganeti VMs: an-airflow1002 (research), an-airflow1003 (platform eng)
2. Create new system users: analytics-research, analytics-platform-eng. Declare these system users in `profile::analytics::cluster::users`. They should also be added to admin data.yaml, but commented out until T231067 is complete (as the other system users are). See `analytics-search` as an example.
3. Create new user groups analytics-research-users and analytics-platform-eng-users, listing the relevant users in `members` and the system user in `system_members`. Members of these groups should have sudo privileges to their system user. Also include the system users in the analytics-privatedata-users group. See `analytics-search-users` as an example. We'll also need to make sure users in these groups can manage airflow services. See `airflow-search-admins` for an example. I'd prefer not to create more admin groups if we can help it.
4. Create Kerberos principals and keytabs for these system users at their Airflow VM hostnames, following https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos#Create_a_keytab_for_a_service
5. Create the airflow instances on the VMs following https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow#Creating_a_new_Airflow_Instance
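
Since we'll likely want to formalize this, here is a rough Python sketch of how the names in the steps above relate to each other per team. It is only an illustration: the `AirflowInstancePlan` class and `checklist` helper are hypothetical, the `example.org` domain and `EXAMPLE.ORG` realm are placeholders rather than our real values, and the principal format is the generic Kerberos `service/host@REALM` shape, which should be checked against the Kerberos wikitech page linked above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AirflowInstancePlan:
    """Hypothetical helper: the names one team's Airflow instance needs."""

    team: str  # e.g. "research"
    vm: str    # Ganeti VM hostname, e.g. "an-airflow1002"

    @property
    def system_user(self) -> str:
        # Naming pattern taken from step 2: analytics-<team>
        return f"analytics-{self.team}"

    @property
    def user_group(self) -> str:
        # Naming pattern taken from step 3: <system user>-users
        return f"{self.system_user}-users"

    def principal(self, domain: str, realm: str) -> str:
        # Generic Kerberos service principal shape; domain and realm are
        # supplied by the caller because they are not stated in this task.
        return f"{self.system_user}/{self.vm}.{domain}@{realm}"


# The two instances requested in this task.
PLANS = [
    AirflowInstancePlan(team="research", vm="an-airflow1002"),
    AirflowInstancePlan(team="platform-eng", vm="an-airflow1003"),
]


def checklist(
    plan: AirflowInstancePlan,
    domain: str = "example.org",  # placeholder, not the real domain
    realm: str = "EXAMPLE.ORG",   # placeholder, not the real realm
) -> list[str]:
    """Return the five steps from this task with one team's names filled in."""
    return [
        f"Create Ganeti VM {plan.vm}",
        f"Create system user {plan.system_user}",
        f"Create user group {plan.user_group}",
        f"Create principal and keytab for {plan.principal(domain, realm)}",
        f"Create the Airflow instance on {plan.vm}",
    ]


if __name__ == "__main__":
    for plan in PLANS:
        print("\n".join(checklist(plan)))
        print()
```

Running it prints one five-item checklist per team, which could be pasted into a subtask when onboarding the next team.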