Add a kubernetes module to spicerack
Closed, Resolved, Public

Description

Problem statement
We have a series of maintenance tasks that would benefit from cookbooks being able to interact with the kubernetes api.

API sketch
I would expect the interface to expose an accessor like the following:

# Returns an object that allows to interact with the specified cluster
k8s = spicerack.kubernetes(group, name)
# Cordon a kubernetes node
k8s.get_node('kubernetes1001').cordon()
# Get all services running in a cluster
all_services = [ns.get_services() for ns in k8s.namespaces]

I would expect a hierarchy of objects that follows pretty closely what the kubernetes api exposes: the root object returned from spicerack.kubernetes represents the root of one cluster's api, and Node, Namespace, Deployment, Service etc. objects represent what we get from the kubernetes api.
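
A minimal sketch of how such a hierarchy could look, assuming an in-memory stand-in for the API server. The class layout, the `nodes` mapping and the example group and cluster names are illustrative assumptions, not the interface that was eventually implemented:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One cluster node, exposing only what cookbooks need."""

    name: str
    unschedulable: bool = False

    def cordon(self) -> None:
        # A real implementation would send a request to the API server here.
        self.unschedulable = True


@dataclass
class Namespace:
    """One namespace and the names of the services it contains."""

    name: str
    services: List[str] = field(default_factory=list)

    def get_services(self) -> List[str]:
        return list(self.services)


@dataclass
class Kubernetes:
    """Root object for one cluster (hypothetical shape)."""

    group: str
    name: str
    nodes: Dict[str, Node] = field(default_factory=dict)
    namespaces: List[Namespace] = field(default_factory=list)

    def get_node(self, name: str) -> Node:
        if name not in self.nodes:
            raise KeyError(f"node {name} not found in cluster {self.name}")
        return self.nodes[name]


# Placeholder data standing in for what the API would return.
k8s = Kubernetes(
    group="example-group",
    name="example-cluster",
    nodes={"kubernetes1001": Node("kubernetes1001")},
    namespaces=[Namespace("default", ["kubernetes"])],
)
k8s.get_node("kubernetes1001").cordon()
all_services = [ns.get_services() for ns in k8s.namespaces]
```

Keeping one small class per API object makes it easy to add methods only when a cookbook actually needs them.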

My intention is to not feature bloat the implementation, and only implement the methods/objects I need for cookbooks we're writing. The reason is that the k8s api is vast enough that we might otherwise write a lot of useless code.

Configuration files

In all, I expect we will need the following configuration:
  • A file containing the metadata of all of our kubernetes clusters and cluster groups
  • The kubeconfig credentials for at the very least a full admin account (for write operations) and a global read-only account (for everything else)
  • Possibly access to the service::catalog for the service/namespace/discovery names correspondence - although I'd like to keep that separated and mix the two parts within the cookbooks.
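
As a rough illustration of how the first two items could fit together, the sketch below resolves a kubeconfig file from cluster metadata. The metadata shape, the directory layout and the `admin`/`readonly` file naming are all hypothetical, chosen only to make the example concrete:

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical metadata: cluster group -> clusters in that group.
CLUSTERS: Dict[str, List[str]] = {
    "example-group": ["cluster-a", "cluster-b"],
}


def kubeconfig_path(config_dir: Path, group: str, cluster: str, *, write: bool = False) -> Path:
    """Return the kubeconfig to use for a cluster (assumed layout)."""
    if cluster not in CLUSTERS.get(group, []):
        raise ValueError(f"unknown cluster {group}/{cluster}")
    # Use the admin credentials only when a write operation is requested.
    account = "admin" if write else "readonly"
    return config_dir / group / f"{account}-{cluster}.config"
```

Defaulting to the read-only account means a cookbook has to ask explicitly for write access.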

Implementation details

I'm unsure if we should use one of the many kubernetes python libraries available:

  • python-kubernetes, the official library, is quite hard to package in a recent version for bullseye given its dependencies (as @Majavah can testify). I also don't much like the interface it exposes, because it's not really pythonic.

  • pykube_ng (https://pykube.readthedocs.io/en/latest/) has a very spicerack-like interface, but it needs to be packaged.

  • the other alternative is to write our own wrapper that shells out to kubectl (like I did for k8sh). While this might not look ideal at face value, it would allow us to cut down on the external dependencies we would have to keep updated in the long run. It would, however, mean more work on our side to parse the status of an individual component.
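
To give an idea of what the kubectl option involves, here is a minimal sketch of a wrapper that shells out and parses the result. It asks for `-o json` so that it reads the API object rather than the human-oriented table. The function names and the injectable `runner` parameter are assumptions made for the example:

```python
import json
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_kubectl(args: List[str]) -> str:
    """Run kubectl and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def node_is_schedulable(name: str, kubeconfig: str, runner: Runner = run_kubectl) -> bool:
    """Return True unless the node is cordoned (spec.unschedulable is true)."""
    out = runner(["--kubeconfig", kubeconfig, "get", "node", name, "-o", "json"])
    node = json.loads(out)
    # The field is absent on nodes that have never been cordoned.
    return not (node.get("spec") or {}).get("unschedulable", False)
```

Passing the runner as a parameter lets the parsing be exercised without a cluster, for example `node_is_schedulable("kubernetes1001", "/path/to/kubeconfig", runner=lambda args: '{"spec": {}}')`.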

Event Timeline

Restricted Application added a subscriber: Aklapper.

My 5 cents on those:

  • pykube-ng: I have some experience with the old one (the kelproject one) from back in the day, and it was very unpleasant. It might have improved since, but I wanted to share that anyway.
  • python-kubernetes: While it is not very pythonic, I kind of liked it because it resembles the API properly and did not have a bunch of compatibility issues (like pykube had). It potentially still has some with suboptimal client/server version combinations, but in my experience that happens very rarely and usually in "exotic" places (an API/object not being available in the client when it was introduced in a new k8s version, or an API that moved from alpha to beta and changed drastically). I think we should be fine for our current use case.
  • kubectl: Unfortunately, AFAIK the output of kubectl has no stability guarantees, which could make this a bit tedious to maintain. The big plus here is that kubectl implements some operations that have no direct API counterpart. One of them is drain, which we would have to implement ourselves (if the above libraries have not done so).
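
For comparison, cordoning a node with python-kubernetes comes down to a single patch call. The sketch below is only an illustration: the helper name `set_unschedulable` is made up, and the client object is passed in so the helper itself does not import the library.

```python
from typing import Any


def set_unschedulable(core_v1: Any, node_name: str, unschedulable: bool = True) -> Any:
    """Cordon (or uncordon) a node by patching spec.unschedulable.

    core_v1 is expected to behave like kubernetes.client.CoreV1Api.
    """
    body = {"spec": {"unschedulable": unschedulable}}
    return core_v1.patch_node(node_name, body)


# Usage against a real cluster (requires the `kubernetes` package;
# the kubeconfig path is a placeholder):
#   from kubernetes import client, config
#   config.load_kube_config(config_file="/path/to/kubeconfig")
#   set_unschedulable(client.CoreV1Api(), "kubernetes1001")
```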

In more detail: I would narrow the choice down to python-kubernetes, which we already use in imagecatalog, versus kubectl.

I started taking a look at how to implement kubectl drain, and while the happy path is quite easy to code, edge case handling might be a bit of a PITA.
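
To illustrate the kind of edge cases involved, here is a small sketch of the pod selection step of a drain. It works on pod objects in their JSON (dict) form and handles only three cases: mirror (static) pods, DaemonSet-managed pods, and pods with no controller, which kubectl drain refuses to delete without --force. The function name and return shape are assumptions, and it deliberately leaves out emptyDir volumes, PodDisruptionBudgets, retries and timeouts:

```python
from typing import Any, Dict, List, Tuple

# Annotation the kubelet sets on mirror pods created for static pods.
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def plan_eviction(pods: List[Dict[str, Any]]) -> Tuple[List[str], List[str], List[str]]:
    """Split pod names into (evict, skipped, blocking)."""
    evict: List[str] = []
    skipped: List[str] = []
    blocking: List[str] = []
    for pod in pods:
        meta = pod.get("metadata") or {}
        name = meta.get("name", "")
        annotations = meta.get("annotations") or {}
        owners = meta.get("ownerReferences") or []
        if MIRROR_ANNOTATION in annotations:
            # Static pods are managed by the kubelet and cannot be evicted.
            skipped.append(name)
        elif any(owner.get("kind") == "DaemonSet" for owner in owners):
            # The DaemonSet controller would recreate these on the same node.
            skipped.append(name)
        elif not owners:
            # No controller would recreate this pod, so evicting it loses it.
            blocking.append(name)
        else:
            evict.append(name)
    return evict, skipped, blocking
```

A caller would typically abort when `blocking` is non-empty, unless the operator has explicitly asked to force the drain.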

I'll start working on the assumption that we pick python-kubernetes for now.

Change 761297 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/software/spicerack@master] k8s: add module

https://gerrit.wikimedia.org/r/761297

Joe triaged this task as Medium priority.

Change 761297 merged by jenkins-bot:

[operations/software/spicerack@master] k8s: add module

https://gerrit.wikimedia.org/r/761297