WMCS cookbooks: provide shared hosts for people without global root privileges
Closed, Resolved, Public

Description

At the moment WMCS cookbooks can be run from either cloudcumin hosts, or local laptops. Running from a shared host has some advantages (e.g. automatic updates, shared logs, using screen/tmux), but it's currently only possible for users with root privileges who can "sudo" on cloudcumin hosts. Other users have to fall back to running cookbooks from their laptop.

Ideally, we would like to provide one or more shared hosts from which different types of users can run WMCS cookbooks:

  • members of the WMCS team (without global root)
  • members of other WMF teams (without global root)
  • volunteers working on WMCS admin tasks
  • WMCS users managing a CloudVPS project

For each category of users, we need to decide the best/easiest way for them to run cookbooks:

  • from cloudcuminXXXX hosts
  • from CloudVPS shared cumin hosts (e.g. cloud-cumin-03.cloudinfra.eqiad1.wikimedia.cloud)
  • from CloudVPS dedicated cumin hosts (only for members of a CloudVPS project)
  • from a laptop (as a last fallback, ideally this should not be needed)

I'm creating this parent task to discuss various use cases and possible implementations. I'm adding as sub-tasks some of the technical challenges:

  • T325067 cloudcumin: decide sudoers rules for users without global root
  • T343335 spicerack: sal_logger does not work when running from CloudVPS instances
  • T343336 spicerack: sal_logger does not work when running from a laptop
  • T344412 Cloudcumin Gaps

Philosophical question: is Spicerack the right tool for all of these use cases? Could some CloudVPS tasks be performed with other tools, e.g. Terraform?

Event Timeline

fnegri renamed this task from WMCS cookbooks: make them available outside of the WMCS team to WMCS cookbooks: provide shared hosts for people without root privileges. Aug 9 2023, 3:25 PM
fnegri updated the task description.

I have major concerns about indefinitely postponing the implementation of T325067: cloudcumin: decide sudoers rules for users without global root. This effectively creates a split where non-staff, who have previously had roughly the same access as staff across almost the entire WMCS infrastructure, will now feel like second-class citizens compared to those who have access to the new and shiny hosts. I'm also worried that it'll introduce a bunch of subtle bugs arising from the differences between local and cloudcumin setups, for example with the various sudo hacks currently present in the repository.

Philosophical question: is Spicerack the right tool for all of these use cases? Could some CloudVPS tasks be performed with other tools, e.g. Terraform?

My feeling is that the current setup should focus on the needs of WMCS-maintained projects (Cloud VPS, Toolforge, etc.). Other WMF people who are used to Spicerack might also be interested in using some variation of the setup for their projects, but Terraform and similar tools are likely a better option for those not familiar with either.

@taavi thanks for your comment. I think I can speak on behalf of the WMCS team if I say that we definitely do not want volunteers (and staff without global root) to feel like second-class citizens. I agree that T325067 is an important task and I created this parent task to provide more context and better define the use cases that make it important.

My feeling is that the current setup should focus on the needs of WMCS-maintained projects (Cloud VPS, Toolforge, etc.).

This feels like a good perimeter to me, and means that we could leave out of scope (at least initially) the last user group I defined in the task description ("WMCS users managing a CloudVPS project"), while keeping in scope the third group ("volunteers working on WMCS admin tasks").

I will discuss this with @Volans later this month (I'm on holiday next week), but in the meantime I think it would also be useful to describe specific scenarios in the comments below, e.g. referencing specific tasks or activities where not having access to a shared cookbook host is a hindrance. If we spell out a few use cases, we will clarify the user needs and strengthen the case for having a shared host that is accessible to non-staff/non-roots.

fnegri triaged this task as Medium priority. Aug 10 2023, 9:24 AM

I will discuss this with @Volans later this month (I'm on holiday next week), but in the meantime I think it would also be useful to describe specific scenarios in the comments below, e.g. referencing specific tasks or activities where not having access to a shared cookbook host is a hindrance. If we spell out a few use cases, we will clarify the user needs and strengthen the case for having a shared host that is accessible to non-staff/non-roots.

Thanks!

One current example is the Toolforge Kubernetes upgrade process, which is something I do fairly often. T343869: Turn wmcs-k8s-node-upgrade.py into a set of cookbooks is turning a current script into proper cookbooks. The full cluster upgrade takes a while, so it'd be great if it could run in a tmux on a cloudcumin host so I didn't have to worry about my laptop suspending or losing network connectivity for a couple of hours.

taavi renamed this task from WMCS cookbooks: provide shared hosts for people without root privileges to WMCS cookbooks: provide shared hosts for people without global root privileges. Aug 10 2023, 12:44 PM

Could some CloudVPS tasks be performed with other tools, e.g. Terraform?

On 2023-08-10 Hashicorp announced their intent to re-license Terraform and other products under the non-OSI approved Business Source License. I believe this change should cause a general re-evaluation of WMCS efforts to support or use their tool.

fnegri added a subscriber: jbond.

Could some CloudVPS tasks be performed with other tools, e.g. Terraform?

On 2023-08-10 Hashicorp announced their intent to re-license Terraform and other products under the non-OSI approved Business Source License. I believe this change should cause a general re-evaluation of WMCS efforts to support or use their tool.

Hot off the press, but perhaps https://opentf.org/announcement

fnegri claimed this task.
fnegri moved this task from Backlog to Done on the cloud-services-team (FY2023/2024-Q3-Q4) board.

There are still some gaps (T344412: cloudcumin: support reimage and other operations) but I think the main requirement of this task is now satisfied: users without global root can run WMCS cookbooks from cloudcuminXXXX hosts, if they are added to the wmcs-roots group in modules/admin/data/data.yaml.

If there are additional requirements, I would suggest creating new and more specific tasks.