Set up a proper deployment strategy for Kubernetes
Closed, Duplicate | Public

Description

Currently we deploy badly built static debs. These are Go binaries, and debs aren't the best way to deploy them. They are also built on my local machine (gulp!), which is obviously a terrible idea.

This task is for figuring out both an automated way to build them and a way to deploy them (scap3 eventually, maybe?).

Event Timeline

Kubernetes 1.2 is out now, so we can use the opportunity to upgrade and move to a new setup.

Here's what needs to happen when we 'deploy a new version of kubernetes':

On a deployment/build host:

  1. Push a tag to our kubernetes gerrit repo with the version we want to build
  2. Kick off a build on the deployment host. Note that this downloads a significant portion of the internet and also requires Docker (https://github.com/kubernetes/kubernetes/tree/master/build has more info)
  3. Wait for the build to finish (this can take a while!)
  4. The build produces a tarball, which we'll extract to a web root under a versioned path.
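
Step 4 above could look roughly like the sketch below. This is only an illustration: the function name `publish_build`, the `vX.Y.Z` tag format and the idea of refusing to overwrite an already published version are assumptions, not something the build currently does.

```python
import re
import tarfile
from pathlib import Path

# Assumed tag format (e.g. "v1.2.0"); adjust to whatever the gerrit tags look like.
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")


def publish_build(tarball: Path, web_root: Path, version: str) -> Path:
    """Extract a build tarball into <web_root>/<version>/ and return that path."""
    if VERSION_RE.fullmatch(version) is None:
        # Also keeps things like "../etc" out of the path we create.
        raise ValueError(f"unexpected version tag: {version!r}")
    target = web_root / version
    # exist_ok=False: raise FileExistsError rather than overwrite a published version.
    target.mkdir(parents=True, exist_ok=False)
    with tarfile.open(tarball) as archive:
        archive.extractall(target)
    return target
```

Keeping each version in its own directory means older builds stay available for later deploys.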

and then...

On master node:

  1. Stop the three services that will be running (apiserver, scheduler, controller-manager)
  2. wget the three binaries of the given version and replace the current binaries
  3. Start them back up, and make sure they're working ok!

On worker nodes:

  1. Stop the two services that are running (kubelet, kube-proxy)
  2. wget the binaries for those services at the given version and replace the current binaries
  3. Start them back up, make sure they are working ok!
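
A minimal sketch of the per-node steps, covering both node types. Several things here are assumptions for illustration: the `deploy` function itself, the `<base_url>/<version>/<binary>` URL layout, and the use of `systemctl` to stop and start services. It also differs slightly from the list above in that it downloads before stopping anything, so a failed fetch leaves the running services alone.

```python
import os
import subprocess
import urllib.request
from pathlib import Path
from typing import Callable, List, Sequence

# Binary/service names per node type, following the lists above.
ROLE_SERVICES = {
    "master": ["kube-apiserver", "kube-scheduler", "kube-controller-manager"],
    "worker": ["kubelet", "kube-proxy"],
}


def run_command(argv: Sequence[str]) -> None:
    subprocess.run(list(argv), check=True)


def fetch(url: str) -> bytes:
    with urllib.request.urlopen(url) as response:
        return response.read()


def deploy(
    role: str,
    version: str,
    base_url: str,
    bin_dir: Path,
    run: Callable[[Sequence[str]], None] = run_command,
    download: Callable[[str], bytes] = fetch,
) -> List[str]:
    """Fetch the binaries for `role` at `version`, swap them in, restart services."""
    services = ROLE_SERVICES[role]
    staged = {}
    for name in services:
        # Stage next to the final path so os.replace stays on one filesystem.
        tmp = bin_dir / f".{name}.{version}.tmp"
        tmp.write_bytes(download(f"{base_url}/{version}/{name}"))
        tmp.chmod(0o755)
        staged[name] = tmp
    for name in services:
        run(["systemctl", "stop", name])  # assumption: services are systemd units
    for name, tmp in staged.items():
        os.replace(tmp, bin_dir / name)
    for name in services:
        run(["systemctl", "start", name])
    return list(services)
```

The "make sure they're working ok" part is deliberately left out; that still needs a human or a health check.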

So the deploy should take a version as a parameter. This allows us to 'roll back' by just deploying an earlier version.
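
If we want help picking the version to roll back to, something like the sketch below would do. The helper names and the `vX.Y.Z` tag format are assumptions. The one real point is that versions need to be compared numerically, since a plain string sort would put v1.10.0 before v1.2.0.

```python
import re
from typing import Iterable, Optional, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn an assumed "vX.Y.Z" tag into a tuple of ints for comparison."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def rollback_target(available: Iterable[str], current: str) -> Optional[str]:
    """Return the newest published version older than `current`, if any."""
    current_key = parse_version(current)
    older = [tag for tag in available if parse_version(tag) < current_key]
    return max(older, key=parse_version) if older else None
```

Rolling back is then just a normal deploy with the returned version as its parameter.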

Things Scap needs to have to make this possible:

  1. Ability to set up a deployment server without bringing in all of mediawiki, salt, trebuchet, trebuchet-trigger, redis, etc.
  2. Ability to just curl from a webserver instead of using git (because binaries!)

Change 279648 had a related patch set uploaded (by Yuvipanda):
tools: Add class that helps build kubernetes

https://gerrit.wikimedia.org/r/279648

Intermediate alternative if we can't actually get scap to do this in the meantime:

  1. The k8s build script extracts the tar into a well-known location, under a path that includes the version number
  2. A simple static file server serves these binary blobs
  3. We have simple scripts on the master and worker nodes that take the version as a parameter, pull these files down and restart services.

We'll run these commands on the nodes via old fashioned ssh / clustershell / yuvi-types-furiously-on-multiple-tabs methods.
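
For the static file server in step 2, even the Python standard library would be enough to start with. The sketch below is an assumption about how that could look, not what is deployed; `make_server` is a made-up helper name, and a real setup would more likely sit behind an existing web server.

```python
import functools
import http.server
from pathlib import Path


def make_server(web_root: Path, host: str = "127.0.0.1", port: int = 0):
    """Build an HTTP server that serves files under web_root (port 0 = pick a free port)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(web_root)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_server(Path("/srv/k8s"), "0.0.0.0", 8080).serve_forever()` would then expose paths like `/v1.2.0/kubelet` to the nodes; the directory and port in that call are placeholders.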

Things Scap needs to have to make this possible:

  1. Ability to set up a deployment server without bringing in all of mediawiki, salt, trebuchet, trebuchet-trigger, redis, etc.

This is just a minor refactor of the scap3 classes in puppet. Actually it's mostly there already but probably needs a bit more separation of concerns.

  2. Ability to just curl from a webserver instead of using git (because binaries!)

git-fat, git-annex or git-lfs all provide good ways to do something like this. We already support git-fat, plan to support git-annex and are also looking into git-lfs. Of course, it wouldn't be difficult to provide a way to just curl a file either.

I think almost all of the requirements for a really minimal deployment server can be satisfied by scap::deploy_host once this merges: https://gerrit.wikimedia.org/r/#/c/279198/5

mmodell moved this task from Needs triage to Services improvements on the Scap board.
mmodell moved this task from Services improvements to Scap3-Adoption-Phase1 on the Scap board.
mmodell edited projects, added Scap (Scap3-Adoption-Phase1); removed Scap.

Change 279648 merged by Yuvipanda:
tools: Add class that helps build kubernetes

https://gerrit.wikimedia.org/r/279648

Our current makeshift setup worked fine for the 1.2 upgrade. I'm going to let this be while other services migrate to scap3, and pick this up in a month.

yuvipanda lowered the priority of this task from High to Medium. (May 3 2016, 11:41 AM)

It's been a month, and this actually continues to work out ok for us...

Cleaning up some old workboards.

Is this task still needed? If so, is scap3 still a part of the discussion?