Requesting access to Wiki Replicas end-to-end tiers for dr0ptp4kt
Closed, Resolved | Public | Request

Description

Requestor provided information and prerequisites

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDoKHi5isY9FixH31qz/81V7fOHsorLZI/NLKr9Z6Xawl2a2Ih0ZV/pJtD+BTu1ufK2QOdgobeRSrnybzf2/1aCqi3Z9H2XxJhMCfnLb/9AIcKJ9tN63T4nRnjLoPsmRgDQrOSIqY5NfLKzXBsQOqc3chZ5SaDf8f09OdBk+Obn5vhr6yWh4GhrfTzoZUfp6+JRiueZZYuGMIKdBAH82s9TyuhuGWvHJmO9WC1MJOV/3hIcim+X0xR+BNLEU/Uj4OPEXC0/EiXh2CJDLugBpLU28RF+Y16TRj/WmO2H0H6qVdmkiK7Ez9PCbsy4RFPq4hdART9QiQbQJzZzaYSAkSFV
        abaso@wikimedia.org
  • Requested group membership: probably ops_members, modeled after btullis / bstorm
  • Reason for access: Wiki Replicas end-to-end analysis and overview (see multiple tiers at https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas)
  • Name of approving party (manager for WMF/WMDE staff): Olja Dimitrijevic
  • Ensure you have signed the L3 Wikimedia Server Access Responsibilities document: Done
  • Please coordinate obtaining a comment of approval on this task from the approving party. On it.

SRE Clinic Duty Confirmation Checklist for Access Requests

This checklist should be used on all access requests to ensure that all steps are covered, including expansion to existing access. Please double check the step has been completed before checking it off.

This section is to be confirmed and completed by a member of the SRE team.

  • User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document.
  • User has a valid NDA on file with WMF legal. (All WMF Staff/Contractor hiring are covered by NDA. Other users can be validated via the NDA tracking sheet)
  • User has provided the following: wikitech username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform)
  • User has provided a public SSH key. This ssh key pair should only be used for WMF cluster access, and not shared with any other service (this includes not sharing with WMCS access, no shared keys.)
  • The provided SSH key has been confirmed out of band and is verified not being used in WMCS.
  • access request (or expansion) has sign off of WMF sponsor/manager (sponsor for volunteers, manager for wmf staff)
  • access request (or expansion) has sign off of group approver indicated by the approval field in data.yaml

For additional details regarding access request requirements, please see https://wikitech.wikimedia.org/wiki/Requesting_shell_access

Event Timeline

Hi @dr0ptp4kt, could you help me understand what kind of access you are after (i.e., which hosts, services, and commands)? As far as I can see, analytics-privatedata-users is the group to be in for wikireplicas access in production.

Hi @fgiunchedi, using the config at https://wikitech.wikimedia.org/wiki/SRE/Production_access#Setting_up_your_SSH_config (after pinning the confirmed fingerprint obtained via bast1003) I'm seeing the following:

$ ssh clouddb1017.eqiad.wmnet
Enter passphrase for key '<path_to_key>': 
dr0ptp4kt@clouddb1017.eqiad.wmnet: Permission denied (publickey,keyboard-interactive).

Any pointers?

As for commands, generally the ones listed in https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas (plus, surely, heavy use of read utilities; I imagine FS perms generally allow reading, but I expect some cases will require elevation). I anticipate needing to modify, or at least copy and modify, certain scripts and commands and then invoke them narrowly with flags, so that I can run them safely on copies of elements (e.g., distinct safe-to-start/stop/restart haproxy service instances). The systemctl, mysql, maintain-* (because of their internals), and puppet agent commands seem likely.
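For the Permission denied error shown above, a few standard OpenSSH client checks can help rule out local configuration problems. The sketch below is only an illustration: the function names are hypothetical, and the host and key path are the placeholders from the session above. These checks cover the client side only; they cannot show whether the account is actually authorized on the target host.

```shell
# Client-side checks only. Function names are illustrative, not part of
# any Wikimedia tooling.

# Print the settings the OpenSSH client would apply for a host, without
# connecting (ssh -G only evaluates the local configuration).
show_effective_ssh_config() {
    ssh -G "$1" | grep -Ei '^(user|hostname|identityfile|proxyjump|proxycommand) '
}

# Print the fingerprint of a public key file so it can be compared with
# the key attached to the access request.
show_key_fingerprint() {
    ssh-keygen -l -f "$1"
}
```

For example, `show_effective_ssh_config clouddb1017.eqiad.wmnet` followed by `show_key_fingerprint <path_to_key>.pub`. Running `ssh -v clouddb1017.eqiad.wmnet` additionally logs the authentication attempts in verbose form.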

Thank you for the context @dr0ptp4kt. From reading https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups it seems wiki replicas access is granted by virtue of being in analytics-privatedata-users (which you are already), though given the cloud prefix on the host you are trying, I think this is a WMCS thing. I'm looping in Cloud-Services to assist with this request.

Thanks @fgiunchedi and thanks for working through this with me! Should I Phab-mention anyone or tag the ticket, or do the WMCS folks normally catch stuff with the project mention here in Phabricator? Now that I've looked more at the Puppet repo I see the wmcs-admin affiliation for a decent portion of the Wiki Replicas related nodes (including the clouddb* ones). That access seems like it could get me a good part of the way there.

I'll add the following context, mainly for helping remind myself or others following along some day.

What I heard in some conversations the past week was that the cloud prefix on those hostnames is overloaded. The clouddb* servers are said to be part of the prod realm, which seems to mirror the "Service architecture by layers" diagram.

The other analytics replicas (the ones most likely to be used by NDA'd data people) referenced in https://wikitech.wikimedia.org/wiki/Analytics/Data_access are the dbstore* servers, described a bit more at https://wikitech.wikimedia.org/wiki/Analytics/Systems/MariaDB#Database_setup (I can't SSH jump to those, either, although the analytics-privatedata-users access does allow the analytics-mysql wrapper script access via the mysql CLI and the SRV mappings; this is fine for now and probably won't matter, at least not yet).

The WMCS realm "replicas" are fronting proxies on the left-hand side of the "Service architecture by layers" diagram, and their naming convention is described at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Naming_conventions - those have the .cloud suffix. I'm going to Meet with Bryan and Andrew to learn a bit more about WMCS-hosted nodes after attending to some physical world obligations over the next days.

For the right-hand side proxies, dbproxy1018 and dbproxy1019, I may need to look at the Puppet repo a bit more. Those look like they probably need special access (they don't seem to have a specifically defined admin group), but I can start with the other stuff first.

I'll be interested in accessing the various Linux nodes across the architecture, but starting from one clump of servers is a good start.

As for commands, generally the ones listed in https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas (plus, surely, heavy use of read utilities; I imagine FS perms generally allow reading, but I expect some cases will require elevation). I anticipate needing to modify, or at least copy and modify, certain scripts and commands and then invoke them narrowly with flags, so that I can run them safely on copies of elements (e.g., distinct safe-to-start/stop/restart haproxy service instances). The systemctl, mysql, maintain-* (because of their internals), and puppet agent commands seem likely.

Admin actions on the Wiki Replica hosts generally require full root rights. The wmcs-admin role is fundamentally root on the WMCS infrastructure hosts, but this role has been denied having meaningful sudo rights on the Wiki Replicas out of an abundance of caution from the DBA team. See also:

We really need to come up with a way to be able to grant root access to clouddb* hosts that doesn't imply root on all the production databases, because that is really overkill. I don't know if this is something the Infrastructure-Foundations team could help with? cc @joanna_borun

I'm going to open a separate task for the wmcs-admin membership, but leave this task here open so we can continue to explore the additional matter of being

able to grant root access to clouddb* hosts that doesn't imply root on all the production databases

I'll submit a patch on the other task.

Clement_Goubert changed the task status from Open to In Progress. Aug 16 2023, 2:20 PM

We really need to come up with a way to be able to grant root access to clouddb* hosts that doesn't imply root on all the production databases, because that is really overkill. I don't know if this is something the Infrastructure-Foundations team could help with? cc @joanna_borun

FYI, we are trying to come up with better solutions for this in T337848, but we may need some DBA support. I'll ping you on that task ;)

jbond claimed this task.

I'm going to open a separate task for the wmcs-admin membership, but leave this task here open so we can continue to explore the additional matter of being

I'm going to close this task in favour of T344599