Requesting access to analytics-privatedata-users for resquito
Closed, Resolved | Public | Request

Description

Requestor provided information and prerequisites

Complete ALL items below as the individual person who is requesting access:

  • Wikimedia developer account username: resquito
  • Email address: resquito@wikimedia.org
  • SSH public key (must be a separate key from Wikimedia cloud SSH access): ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPOnl0n6sNb8asQv76lGI+oYo7pmVFJ62A3zQxfGgLMT
  • Requested group membership: analytics-privatedata-users
  • Reason for access: Access to Airflow data analytics
  • Name of approving party (manager for WMF/WMDE staff): @HShaikh
  • Ensure you have signed the L3 Wikimedia Server Access Responsibilities document:
  • Please coordinate obtaining a comment of approval on this task from the approving party.

SRE Clinic Duty Confirmation Checklist for Access Requests

This checklist should be used on all access requests to ensure that all steps are covered, including expansion to existing access. Please double check the step has been completed before checking it off.

This section is to be confirmed and completed by a member of the SRE team.

  • User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document.
  • User has a valid NDA on file with WMF legal. (All WMF Staff/Contractor hiring are covered by NDA. Other users can be validated via the NDA tracking sheet)
  • User has provided the following: developer account username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform)
  • User has provided a public SSH key. This ssh key pair should only be used for WMF cluster access, and not shared with any other service (this includes not sharing with WMCS access, no shared keys.)
  • The provided SSH key has been confirmed out of band and is verified not being used in WMCS.
  • Access request (or expansion) has sign off of WMF sponsor/manager (sponsor for volunteers, manager for WMF staff)
  • Access request (or expansion) has sign off of group approver indicated by the approval field in data.yaml

For additional details regarding access request requirements, please see https://wikitech.wikimedia.org/wiki/Requesting_shell_access
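
The checklist above requires that the production SSH key is not the same key used for WMCS. One way to compare two public keys is by fingerprint. The following is a minimal sketch only, not the procedure SRE actually uses; the function names `ssh_fingerprint` and `keys_are_distinct` are hypothetical, and it assumes each key is given as a standard one-line OpenSSH public key (`<type> <base64> [comment]`).

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of one public key line."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    # The second field is the base64-encoded key blob.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 without trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def keys_are_distinct(production_key: str, cloud_key: str) -> bool:
    """True when the two public keys have different fingerprints."""
    return ssh_fingerprint(production_key) != ssh_fingerprint(cloud_key)
```

Comparing fingerprints rather than raw lines ignores differences in the trailing comment field, so the same key pasted with two different comments is still detected as shared.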

Event Timeline

Note that this ticket is a prerequisite for https://phabricator.wikimedia.org/T396672 and that @dr0ptp4kt is also readying a patch for additional access in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1165605 to be taken out of WIP once your initial SSH access is established.

@HShaikh can you approve? Thx

Hi @REsquito-WMF: I am trying to understand if analytics-privatedata-users is really required for this. Can you clarify the reason for your access a bit more? I am also looking at the other patch you referenced and I am just wondering if that is enough.

Hi,

I will need access to the data lake, Hive, and others.

Also, Adam Baso just mentioned to me that I am missing the wmf group.

Thanks. For being added to the WMF group, please follow the instructions at https://wikitech.wikimedia.org/wiki/SRE/LDAP/Groups/Request_access#Using_the_Wikimedia_Identity_Management_System and request access through the online system.

I will take care of the other request shortly.

Thanks @ssingh - I'm wondering, should we create a subheading between https://wikitech.wikimedia.org/wiki/SRE/Production_access#Generating_your_SSH_key and https://wikitech.wikimedia.org/wiki/SRE/Production_access#Filing_the_request to suggest that folks use the idm.wikimedia.org / BITU web interface to request the wmf / nda LDAP group membership as appropriate? Or would it make more sense for the provisioning of wmf / nda group membership as part of the Phabricator template for the SRE person doing the access (or maybe both? and I guess we could add it to the requestor part of the template, too, as yet another prompt)? (In that latter case, I'm wondering if there's anything inside of https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests that we should update as well)? I can see the case either way, main thing is it being step-by-step, I suppose.

Depending on approach there, I think we'll need to adapt https://wikitech.wikimedia.org/wiki/Data_Platform/Data_access#Access_Groups and https://wikitech.wikimedia.org/wiki/Data_Platform/Data_access#Analytics_shell_groups_explained a little as well as a couple parts of Product/Engineering/Onboarding/Checklists/Template on OfficeWiki (historically, we point people to https://phabricator.wikimedia.org/project/view/1564/ but now https://phabricator.wikimedia.org/project/view/1564/ actually points people at BITU / idm.wikimedia.org). I see that Technology/Onboarding/Checklists/Template was updated between last autumn (northern hemisphere) and now to point to the LDAP page, but it seems that Product/Engineering/Onboarding/Checklists/Template was not, so I think I see the source of some of the divergence! Of course it would be nice to consolidate these onboarding templates further, although that's probably a separate matter for a separate day...just good to deal with this common scenario, I think.

Change #1170571 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] admin: add resquito to analytics-privatedata-users

https://gerrit.wikimedia.org/r/1170571
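
Conceptually, a patch like this adds a username to a group's member list in data.yaml. The sketch below illustrates that idea on a plain Python dictionary. The layout (`users` and `groups` keys, a `members` list per group) is a simplified assumption for illustration and may not match the real file, and `add_group_member` is a hypothetical helper, not part of the Puppet repository.

```python
def add_group_member(data: dict, group: str, username: str) -> bool:
    """Add username to the group's member list; return True if it changed."""
    # Refuse to add a member that has no user entry at all.
    if username not in data["users"]:
        raise KeyError(f"unknown user: {username}")
    members = data["groups"][group].setdefault("members", [])
    if username in members:
        return False  # already a member, nothing to do
    members.append(username)
    return True


# Hypothetical, heavily simplified stand-in for the real data.yaml content.
example = {
    "users": {"resquito": {"ensure": "present"}},
    "groups": {"analytics-privatedata-users": {"members": ["someone-else"]}},
}
add_group_member(example, "analytics-privatedata-users", "resquito")
```

Making the helper idempotent means re-running it for an existing member leaves the data unchanged.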

Hi @REsquito-WMF: I am trying to understand if analytics-privatedata-users is really required for this. Can you clarify the reason for your access a bit more? I am also looking at the other patch you referenced and I am just wondering if that is enough.

For this part, he'll need to run SQL queries in the data lake in order to query / troubleshoot queries ahead of any of the extra permissions for the Airflow-specific group. We should grant analytics-privatedata-users in this here ticket.

(EDIT: and to clarify, I'll expect that both via UI and at the shell.)

Now, my thought was to do that other access as a separate ticket closer to its basis, rather than co-mingle it here. Hope that's okay!

Hi @dr0ptp4kt:

Thanks @ssingh - I'm wondering, should we create a subheading between https://wikitech.wikimedia.org/wiki/SRE/Production_access#Generating_your_SSH_key and https://wikitech.wikimedia.org/wiki/SRE/Production_access#Filing_the_request to suggest that folks use the idm.wikimedia.org / BITU web interface to request the wmf / nda LDAP group membership as appropriate? Or would it make more sense for the provisioning of wmf / nda group membership as part of the Phabricator template for the SRE person doing the access (or maybe both? and I guess we could add it to the requestor part of the template, too, as yet another prompt)? (In that latter case, I'm wondering if there's anything inside of https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests that we should update as well)? I can see the case either way, main thing is it being step-by-step, I suppose.

Thanks for the feedback! I will note that I am just the clinic duty person for this week and not the owner of this (that would be Moritz and Simon), so please factor that in for the rest of my response below.

The web-interface request process through Bitu/IDM, at least for the wmf group, is fairly recent. We have updated our documentation for the shift to the new system, and that is reflected in https://phabricator.wikimedia.org/project/profile/1564/ and https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#WMF_Group. I guess what you are suggesting, though, is that the relation between the other access requests -- such as analytics-privatedata-users -- and wmf groups is not clear? Or in general the shift itself?

Depending on approach there, I think we'll need to adapt https://wikitech.wikimedia.org/wiki/Data_Platform/Data_access#Access_Groups and https://wikitech.wikimedia.org/wiki/Data_Platform/Data_access#Analytics_shell_groups_explained a little as well as a couple parts of Product/Engineering/Onboarding/Checklists/Template on OfficeWiki (historically, we point people to https://phabricator.wikimedia.org/project/view/1564/ but now https://phabricator.wikimedia.org/project/view/1564/ actually points people at BITU / idm.wikimedia.org). I see that Technology/Onboarding/Checklists/Template was updated between last autumn (northern hemisphere) and now to point to the LDAP page, but it seems that Product/Engineering/Onboarding/Checklists/Template was not, so I think I see the source of some of the divergence! Of course it would be nice to consolidate these onboarding templates further, although that's probably a separate matter for a separate day...just good to deal with this common scenario, I think.

Yes, that's fair; I think it's clear even based on the clinic duty this week that there is some confusion around this. I will bring this up internally and to Moritz/Simon later. I guess an outsider's perspective -- outside of SRE -- is very helpful in understanding how people view the access request system, so thanks for sharing.

Thanks @ssingh! I think it's probably just a matter of updating the pages. I've had my access for a good while now, and I bet these recent access things just signify the need to update a few of the pages. Now, as to the matter of whether it would make sense to provide wmf / nda at the same time the person puts in the analytics-* access request, I'd be curious to hear where you, Moritz, and Simon land; if users should manually request their additional wmf LDAP access themselves, that's okay, it's just a slight extra step (hence why I might suggest we also update the Phabricator template), and if it's done as a routine part of access provisioning by the clinic duty person that's of course easier (although it may be prone to some slight back-and-forth). One thing I wasn't sure of is whether any scripts that run as part of provisioning or off-boarding make certain assumptions, and how that may interplay here. Not sure if that makes any sense. Anyway, I'll await your word - please do let me know and I can update wiki pages!

@REsquito-WMF would you please let us know once you have clicked "Request this permission" in https://idm.wikimedia.org/permissions/ for the wmf group?

Hi @REsquito-WMF: I am trying to understand if analytics-privatedata-users is really required for this. Can you clarify the reason for your access a bit more? I am also looking at the other patch you referenced and I am just wondering if that is enough.

For this part, he'll need to run SQL queries in the data lake in order to query / troubleshoot queries ahead of any of the extra permissions for the Airflow-specific group. We should grant analytics-privatedata-users in this here ticket.

I have added a patch for at least analytics-privatedata-users and will merge once reviewed.

(EDIT: and to clarify, I'll expect that both via UI and at the shell.)

Now, my thought was to do that other access as a separate ticket closer to its basis, rather than co-mingle it here. Hope that's okay!

We usually keep one task per person, and multiple requests within that task are perfectly fine, IMO.

Change #1170571 merged by Ssingh:

[operations/puppet@production] admin: add resquito to analytics-privatedata-users

https://gerrit.wikimedia.org/r/1170571

@REsquito-WMF: Your access request has been merged. Please allow ~30 minutes for it to roll out. I have also added you to the wmf group.

Kerberos credentials sent:

sukhe@krb1002:~$ sudo manage_principals.py create resquito --email_address=resquito@wikimedia.org
Principal successfully created. Make sure to update data.yaml in Puppet.
Successfully sent email to resquito@wikimedia.org
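
Once the principal exists, the account holder typically needs a valid Kerberos ticket before using Kerberos-protected services. The following is a minimal sketch of how one might check for a ticket from Python. It assumes MIT Kerberos client tools are installed, where `klist -s` runs silently and exits with status 0 only when the credential cache is readable and unexpired; the function name `has_valid_kerberos_ticket` is hypothetical and nothing here is taken from the ticket itself.

```python
import subprocess


def has_valid_kerberos_ticket(run=subprocess.run) -> bool:
    """Return True when `klist -s` reports a readable, unexpired ticket cache."""
    try:
        # check=False: a non-zero exit status is an expected answer, not an error.
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # No Kerberos client tools on this machine.
        return False
    return result.returncode == 0
```

The `run` parameter exists only so the check can be exercised without a Kerberos installation; in normal use it is left at its default.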

I will leave this task open in case you have any questions or something doesn't work out.

Please re-open if there are any issues.

Thanks @ssingh !

I'll leave feedback once I test it today.

Hi @ssingh , over in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1165605 I added the analytics-privatedata-users piece for this ticket, as well as the Airflow part for T396672: Request for dedicated Airflow instance for WME.

Looping @BTullis here for visibility. I'll post on the other ticket as well for cross-referencing.

Change #1165605 had a related patch set uploaded (by Dr0ptp4kt; author: Dr0ptp4kt):

[operations/puppet@production] Add access for platform engineering Airflow and data

https://gerrit.wikimedia.org/r/1165605

Change #1165605 merged by Ssingh:

[operations/puppet@production] Add access for platform engineering Airflow and data

https://gerrit.wikimedia.org/r/1165605

ssingh added a subscriber: CDobbins.

@dr0ptp4kt : Closing this task as part of the clinic duty week this week (@CDobbins' first week). Please re-open if there are any issues. Thanks!

Thanks @ssingh. For the exchange on the documentation starting from T399899#11017204 ...

I will bring this up internally and to Moritz/Simon later. I guess an outsider's perspective -- outside of SRE -- is very helpful in understanding how people view the access request system, so thanks for sharing.

... any take on what we may want to do for updates to the documentation? Over in T396672#11051093 I said I'd check about the documentation stuff here.

Thanks for checking, Adam! Both Moritz and Simon are still out but it's in my list to follow up when they are back.