
Migrate deployment of integration/config for zuul / jjb to scap deploy
Open, Low, Public

Description

integration/config.git has the configuration layout for Zuul. It is currently deployed using a Fabric recipe at the root of the repo, which requires shell access on the CI machine gallium.wikimedia.org.

We should migrate to scap3 deploy so more people can deploy.

Doc: https://doc.wikimedia.org/mw-tools-scap/scap3/quickstart/setup.html

We have to create a scap/scap.cfg and figure out the SSH access: https://doc.wikimedia.org/mw-tools-scap/scap3/ssh-access.html#ssh-access

JJB would have to run with jenkins-jobs update --delete-old in order to delete unmaintained jobs (was T91410, marked as a duplicate).
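As a rough sketch of what a deploy step could invoke, the helper below only builds the argument list for the jenkins-jobs call mentioned above. The function name and the default jjb/ path are assumptions for illustration, not something this task specifies.

```python
import shlex


def jjb_update_command(config_dir="jjb/", delete_old=True):
    """Build the argv for a Jenkins Job Builder update run.

    config_dir is a hypothetical path to the job definitions.
    delete_old adds --delete-old so jobs no longer defined get removed.
    """
    cmd = ["jenkins-jobs", "update"]
    if delete_old:
        cmd.append("--delete-old")
    cmd.append(config_dir)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so nothing touches Jenkins.
    print(shlex.join(jjb_update_command()))
```

The list form can be passed to subprocess.run as-is, which avoids shell quoting issues if the path ever contains spaces.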

Event Timeline

hashar moved this task from Untriaged to Backlog on the Continuous-Integration-Infrastructure board.

So people will be able to deploy without having access to gallium? That seems non-ideal because they won't be able to look at zuul logs then. Or even reload the zuul service?

So people will be able to deploy without having access to gallium?

Yes, that is the idea. It would let more people deploy changes if we grant CR+2 to more people. But the real reason is solely to adopt the generic deployment tool.

That seems non-ideal because they won't be able to look at zuul logs then.

I haven't thought about that one. Looks like we would need a task to send Zuul logs to Logstash. Python logging can emit to syslog, or there might be a Logstash plugin.
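For reference, emitting to syslog from Python logging looks roughly like the sketch below. It uses the standard library SysLogHandler; the logger name, the localhost:514 UDP address, and the daemon facility are assumptions, and how Zuul itself would be wired to this (or to Logstash) is not covered here.

```python
import logging
import logging.handlers


def build_syslog_logger(name="zuul", address=("localhost", 514)):
    """Return a logger that forwards records to a syslog daemon over UDP.

    name and address are illustrative defaults, not Zuul's real settings.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=address,
        facility=logging.handlers.SysLogHandler.LOG_DAEMON,
    )
    # Prefix each message with the logger name so it can be filtered downstream.
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_syslog_logger()
    log.info("layout reloaded")
```

Calling the helper twice for the same name would attach two handlers, so a real setup would configure this once at startup, for example through a logging configuration file.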

Or even reload the zuul service?

scap3 can handle restarting or reloading the service and even has rollback capability.

Change 286207 had a related patch set uploaded (by Hashar):
(WIP) Zuul deployment with scap? (WIP)

https://gerrit.wikimedia.org/r/286207

Change 286207 abandoned by Hashar:
(WIP) Zuul deployment with scap? (WIP)

https://gerrit.wikimedia.org/r/286207

Will stick to deb packages for now. There are only a couple of hosts that rely on it and we only upgrade it once in a while.

Reopening, wrong task / I am confused.

hashar renamed this task from Migrate Zuul deployment of integration/config to scap deploy to Migrate deployment of integration/config for zuul to scap deploy. Feb 7 2019, 9:39 AM

There is no bandwidth to use scap to deploy integration/config changes. For now we use a Fabric script at the root of integration/config which SSHes to the machine and runs the commands. That is good enough for now.

Legoktm lowered the priority of this task from Low to Lowest.

There is no bandwidth to use scap to deploy integration/config changes. For now we use a Fabric script at the root of integration/config which SSHes to the machine and runs the commands. That is good enough for now.

But long-term it's still something that should be done IMO. As I discovered in T236689: Upgrade integration/config to use Fabric 2.x / python3, Fabric upstream is a mess and I don't think it's something we want to rely on forever. Sticking with our current Fabric 1.x version isn't an option because Python 2 will go away too.

There is no bandwidth to use scap to deploy integration/config changes. For now we use a Fabric script at the root of integration/config which SSHes to the machine and runs the commands. That is good enough for now.

Can we do neither of these options, please: avoid the complexity of scap and also not have people manually SSH to contint*? Puppet would reload/restart the zuul service when the config changes and could also git pull from the repo. As long as who can merge changes is limited, that would be OK and would simplify everything. Having users manually SSH to prod servers to restart stuff is really not expected as normal anymore in 2020.

hashar renamed this task from Migrate deployment of integration/config for zuul to scap deploy to Migrate deployment of integration/config for zuul / jjb to scap deploy. Jun 17 2020, 2:05 PM