Streamline Data Platform access approvals for WMF staff
Closed, Resolved (Public)

Description

WMF staff often need access to Data Platform systems, and several levels of access are possible. In practice we always rubber-stamp WMF staff requests: sometimes we need to clarify which access they need, but we always approve.

Proposal: Instead of requiring approval for this access (especially for the analytics-privatedata-users group), automatically approve WMF staff access requests.

The current list of approvers for analytics-privatedata-users is here:

https://github.com/wikimedia/operations-puppet/blob/b7f385bf82811e65c45fc7768efc1abb44fee42b/modules/admin/data/data.yaml#L408-L413

NOTE: We should not change approvers for admin groups.

Event Timeline

SLyngshede-WMF subscribed.

@Ottomata I'm just removing the SRE-Access-Requests tag to remove this from the Clinic Duty dashboard.

This sounds sensible to me, too.
Given that we have approval for the proposal as it stands, how do we proceed?

Do we just need to update the guidelines on Wikitech?

Should we still retain the list of approvers, for when NDA holders or other non-staff members request access?

That and maybe some comments in puppet admin data.yaml to instruct SREs on the right thing to do?

Gehel triaged this task as Medium priority. Aug 14 2024, 8:38 AM

Change #1082826 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/puppet@production] admin data.yaml - explicit approval is not needed for analytics-privatedata-users

https://gerrit.wikimedia.org/r/1082826

> This sounds sensible to me, too.
> Given that we have approval for the proposal as it stands, how do we proceed?
>
> Do we just need to update the guidelines on Wikitech?
>
> Should we still retain the list of approvers, for when NDA holders or other non-staff members request access?

For now, updating Wikitech and adding a comment in data.yaml are fine. Mid-term, the approval management will move to Bitu/idm.wikimedia.org, and there is already support for auto-approvals based on prior conditions (such as "has a @wikimedia.org address" and "has active NDA").
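To make the idea of condition-based auto-approval concrete, here is a rough sketch. It is not Bitu's actual API or data model: the `Requester` class, the function names, and the rule that meeting any one condition is sufficient are all assumptions made for illustration. Only the two condition descriptions come from the comment above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Requester:
    """Hypothetical view of an access requester (not Bitu's data model)."""
    email: str
    has_active_nda: bool = False


def has_wikimedia_address(r: Requester) -> bool:
    # Condition quoted in the thread: "has a @wikimedia.org address".
    return r.email.lower().endswith("@wikimedia.org")


def has_active_nda(r: Requester) -> bool:
    # Condition quoted in the thread: "has active NDA".
    return r.has_active_nda


# Assumption: any single condition is enough to skip manual approval.
AUTO_APPROVE_CONDITIONS: list[Callable[[Requester], bool]] = [
    has_wikimedia_address,
    has_active_nda,
]


def can_auto_approve(r: Requester) -> bool:
    """Return True if at least one auto-approval condition holds."""
    return any(cond(r) for cond in AUTO_APPROVE_CONDITIONS)
```

Under these assumptions, `can_auto_approve(Requester("staff@wikimedia.org"))` is true, while a requester with an outside address and no NDA would still go through manual approval.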

What's the rationale for treating non-staff differently? Is it intentional? If they have signed the necessary NDAs (which is validated independently anyway), why do they still need explicit approval when staff do not?

Historically, this approval step was introduced after someone ran a very heavy Hadoop query that affected other users of the cluster. The approval was meant to ensure that new Hadoop users familiarise themselves with the constraints of running queries. But I'd expect that non-staff (say, a researcher working with an existing staff analyst) would get the same introduction anyway?

> Mid-term the approval management will move to Bitu/idm.wikimedia.org

COOL!

> What's the rationale for treating non-staff differently?

Good question. I think it was just easier to justify skipping approval for staff. More staff ask for access and we always approve anyway, so it made sense to streamline. Non-staff requests are less frequent, and I kind of like knowing about non-staff who get access.

You are right, though. I don't think there are any safety reasons to require approval for non-staff, and tagging Data-Engineering in the Phab request should be sufficient for notification.

Change #1082826 merged by Ottomata:

[operations/puppet@production] admin - explicit approval not needed for analytics-privatedata-users

https://gerrit.wikimedia.org/r/1082826