Build a recent python3-kubernetes client package to be used with tooling in Toolforge.
|Open||None||T159892 Make tools-webservice use the official kubernetes python client rather than pykube|
|Open||None||T197930 Build or backport .deb for kubernetes python client for Stretch and Buster|
root@tools-static-12:~# apt-cache policy python-k8sclient
python-k8sclient:
  Installed: (none)
  Candidate: 0.3.0-1
  Version table:
     0.3.0-1 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
This is available on stretch.
root@tools-static-12:~# apt-cache depends python-k8sclient
python-k8sclient
  Depends: python-dateutil
  Depends: python-pbr
  Depends: python-six
  Depends: python-urllib3
  Depends: <python:any>
    python
  Depends: <python:any>
    python
  Suggests: python-k8sclient-doc
09:09:37 0 ✓ zhuyifei1999@tools-bastion-02: ~$ apt-cache policy python-dateutil python-pbr python-six python-urllib3
python-dateutil:
  Installed: 1.5+dfsg-1ubuntu1
  Candidate: 1.5+dfsg-1ubuntu1
  Version table:
 *** 1.5+dfsg-1ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
python-pbr:
  Installed: (none)
  Candidate: 0.7.0-0ubuntu2
  Version table:
     0.7.0-0ubuntu2 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
python-six:
  Installed: 1.9.0-1~trusty1
  Candidate: 1.9.0-1~trusty1
  Version table:
 *** 1.9.0-1~trusty1 0
       1001 http://apt.wikimedia.org/wikimedia/ trusty-wikimedia/universe amd64 Packages
        100 /var/lib/dpkg/status
     1.5.2-1ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     1.5.2-1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
python-urllib3:
  Installed: 1.7.1-1ubuntu4
  Candidate: 1.7.1-1ubuntu4
  Version table:
 *** 1.7.1-1ubuntu4 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     1.7.1-1build1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
python-pbr is the only one that isn't installed by default. It's still available :)
We may want to do this with pypi2deb or something, because 7.0.0 is really old. 10.0.1 is keyed off 1.15 (and fixes a security issue), and they haven't even updated to 1.16 last I checked.
Some thoughts on the current stated task goal:
- webservice is currently python2 compatible
- webservice is installed in all Toolforge Docker containers, many (most?) of which do not include python3
- We have jessie containers still and no active plan to deprecate them (which we probably need to figure out soon?)
- Python lovers are excited for the death of python2, but we will not have a python2-free distro until Bullseye, which will arrive sometime in late 2021 (probably 2022 before it is in active use in Toolforge)
Some responses and thoughts as well:
- It seems reasonable to me to stop producing webservice packages for jessie after moving to this library, leaving jessie containers to use the version they have as part of deprecation processes. I mean, Debian is doing that, right? If we think of it that way, that would kind of stop all concern about jessie with regard to the webservice package. If a new feature in webservice is needed, run it outside a container, as long as the webservice-runner still works. I do highly question how much it matters to support running the webservice frontend command inside a container anyway (as convenient as it may be).
- https://pypi.org/project/kubernetes/ <-- recent versions of the official client still support python2, so we might be able to do this task for future-proofing/scripting and just tack on py2 support for webservice. However, I don't expect them to support it for long, and staying up to date on this library is something I consider a serious priority for security and sustainability.
- It is also important to remember that this general topic blocks Kubernetes upgrades past our current 1.15 minor version, which is not good (they are already at 1.17 upstream) and adds weight to avoiding python3 purity for now.
Do we actually need the client library for the relatively small number of things that webservice does with the Kubernetes API? Maybe it would be less effort to roll our own minimal client using requests? Webservice really only does these things with the API:
- List Pods matching labels
- List Deployments matching labels
- List Services matching labels
- List Ingresses matching labels
- Create Deployment
- Create Service
- Create Ingress
- Create Pod
- Delete Ingress
- Delete Service
- Delete Deployment
- Delete ReplicaSet
- Delete Pod
That's only 3 actions against 5 object types. I think the code to implement all of them would be relatively short and repetitive.
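The list above could be sketched as a thin wrapper around requests. Everything here (the class name, the path table, the cert-based auth wiring, the API versions) is illustrative and an assumption, not an existing Toolforge module:

```python
import requests

# Map each object kind to its API group prefix (core vs apps vs extensions).
# Ingresses were still extensions/v1beta1 in the 1.15-era clusters discussed
# here; this table is exactly the kind of thing that has to track upstream.
API_PREFIXES = {
    "pods": "api/v1",
    "services": "api/v1",
    "deployments": "apis/apps/v1",
    "replicasets": "apis/apps/v1",
    "ingresses": "apis/extensions/v1beta1",
}

def object_path(kind, namespace, name=None):
    """Build the API path for a namespaced collection or a single object."""
    path = "/{}/namespaces/{}/{}".format(API_PREFIXES[kind], namespace, kind)
    return "{}/{}".format(path, name) if name else path

class MiniK8sClient:
    """Just enough client for list/create/delete on five object types."""

    def __init__(self, server, cert, key, ca):
        self.server = server
        self.session = requests.Session()
        self.session.cert = (cert, key)  # modern TLS client-cert auth
        self.session.verify = ca

    def list_objects(self, kind, namespace, labels):
        selector = ",".join(
            "{}={}".format(k, v) for k, v in sorted(labels.items())
        )
        resp = self.session.get(
            self.server + object_path(kind, namespace),
            params={"labelSelector": selector},
        )
        resp.raise_for_status()
        return resp.json()["items"]

    def create_object(self, kind, namespace, spec):
        resp = self.session.post(
            self.server + object_path(kind, namespace), json=spec
        )
        resp.raise_for_status()
        return resp.json()

    def delete_object(self, kind, namespace, name):
        resp = self.session.delete(
            self.server + object_path(kind, namespace, name)
        )
        resp.raise_for_status()
```

Old-style token auth would just swap the cert tuple for an `Authorization: Bearer ...` header on the session, so both auth modes fit the same shape.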
TLDR: I agree. Let's do that instead.
The library does a good job of importing kubeconfig data or service token data without the user needing to think about it much. It does error handling in a similar spirit to the go library, along with object validation across the setup, and it eventually tracks upstream changes. Ingresses are changing in 1.18, possibly to GA (or at least dropping the alternative betas), with a new services-api potentially becoming less feature-gated, so even just using what we use today implies tracking changes in Kubernetes objects and validation. From that perspective, I still see some benefit, since all of these areas can change. They might not be that hard for us to keep up with ourselves, though!
I agree that our current minimal use case in webservice could be resolved by reimplementing everything ourselves to avoid a deb build (and all that implies). However, if anything changes in authentication (such as OIDC tokens being implemented or k8s suddenly producing a real user object), we would have to maintain that ourselves, and the single use case becomes a problem. Using the official client gives us some shot at keeping up, plus a consistent coding style across python tooling (as long as they don't randomly drop the python client and tell us to use go one day, which isn't the trend so far). They maintain the project, fix CVEs, etc. I think we should consider this useful not just for webservice but also for future tooling, since the team is very focused on python as a language.
All that said, I'm interested to see what kind of mess it would make to deb-ify this when we have this in the requirements.txt:
certifi>=14.05.14  # MPL
six>=1.9.0  # MIT
python-dateutil>=2.5.3  # BSD
setuptools>=21.0.0  # PSF/ZPL
pyyaml>=3.12  # MIT
google-auth>=1.0.1  # Apache-2.0
ipaddress>=1.0.17;python_version=="2.7"  # PSF
websocket-client>=0.32.0,!=0.40.0,!=0.41.*,!=0.42.*  # LGPLv2+
requests  # Apache-2.0
requests-oauthlib  # ISC
urllib3>=1.24.2  # MIT
You can get around setuptools...maybe pyyaml. Then come the python2 problems (in terms of packaging). It might not be that bad with requests and urllib3 weakly versioned, honestly. I just get worried because of the experience we've had doing this with the openstack-client packages. On jessie, there seems to be no python-certifi package, and stretch has 2016.2.28-1. For the six library, we already have 1.9.0-3~bpo8+1 on jessie, and 1.10 for mitaka. dateutil is at 2.4.2 on jessie. That might be our blocker right there (also ipaddress 1.0.16 and websocket 0.18.0-2). I do think that webservice should really be deprecated on jessie except for the runner, but that's a separate issue (especially if deb dependencies come into it).
The problems with deb packaging make me wish webservice was written in go. I think I'm convincing myself by "typing it out" that the best way forward is our own mini-client like you are suggesting. It'll need to do old-style token auth and modern TLS auth, of course...it probably doesn't need to understand serviceaccount secrets at all right now. A new task to spec this out seems to make sense. It also seems to me that new future tooling could probably use either a different delivery method than deb packages (like maintain-kubeusers does with docker) or might want to be in a compiled binary format (go, rust, c, whatever)
random idea: would it make sense to have the webservice mechanism be a REST API that users call with a very simple script, hiding all the gory details behind a service under our control? That would greatly reduce some of the complex things you are mentioning (versioning matrix, etc.), and offer other benefits. It has some challenges, like auth, etc.
We could also maintain it as a service in Kubernetes then as well :) Or install from source into a venv on another system, if we are willing to give up the free HA in exchange for a deploy that is easier to reason about. I like that idea a lot for that end of this. It would also simplify updates.
I hate the idea of maintaining our own library to talk to Kubernetes when Kubernetes maintains a perfectly good, community-developed library just for that (even if the bulk is autogenerated and it reads more like go than python in places), but the dependency hell is what pushes me in that direction.
All that said, @aborrero, there would be an issue around auth. This proposed API would ideally not hold everyone's private key, but I'm not sure how it could function without it. I mean, it could assume the role of everyone else's default service account, which would preserve all access restrictions in what it does, but it had better be damned sure of who made the request (probably by validating the x509 client cert for the request). That's probably doable, actually, making this not a problem at all. We'd simply maintain the same local backend for the gridengine setup.
Ooooorrr, that could be a service in every tool namespace. That means it isn't a monolith with access to anything but itself. It runs as the default service account and responds to its owner's cert with the simple commands of start, stop and restart, handling all communication with k8s on its own. That would fix the auth problem without creating a global sudo of any kind. It could even be expanded to include a token auth system for CI in the future....@bd808
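A rough sketch of that ownership check, assuming (hypothetically) that a tool's cert CN equals its tool name and that the service runs inside the matching "tool-<name>" namespace; none of these names are an existing interface:

```python
def caller_is_owner(cert_cn, namespace):
    """Assume a tool's client cert CN is its tool name, and this service
    lives in the 'tool-<name>' namespace it manages."""
    return namespace == "tool-{}".format(cert_cn)

def handle_command(command, cert_cn, namespace):
    """Accept start/stop/restart only from the namespace's owning tool.

    Returns an (http_status, message) pair; the real k8s calls the service
    would make as the default service account are elided.
    """
    if not caller_is_owner(cert_cn, namespace):
        return (403, "cert CN does not match this tool's namespace")
    if command not in ("start", "stop", "restart"):
        return (400, "unknown command")
    # ...act against the k8s API as the namespace's default service account...
    return (200, "ok")
```

Because the check is purely "does your CN match my namespace", compromising one such service only ever exposes its own tool, which is the no-global-sudo property described above.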
However, I still see all of these as reasons to close this ticket as wontfix.
Proposal: Cancel this chain of tasks based on the discussion above. Create a new task to modify webservice's Kubernetes backend for the new cluster only with design being the first step.
To my mind, I almost have written in my head a small service that lives in your tool namespace and does mutual TLS in python behind uwsgi, so that it can validate that your CN matches its namespace using nothing more than what we have already built for auth and RBAC, and then acts as the default service account on your behalf to build and tear down webservices, abstracting away the concern for how K8s design changes over time. However, it may or may not be the best move, so I'll happily leave it to that task 😁
Well that was easy. ;)
I do think that webservice should really be deprecated on jessie except for the runner, but that's a separate issue (especially if deb dependencies come into it).
We need to just get rid of Jessie containers. I don't like managing multiple deprecation tasks at the same time when we can avoid it, so ideally we would start this process after (but soon after) we migrate all the workloads to the 2020 Kubernetes cluster. An awesome side effect of the new cluster is that we have already made it possible to inspect the global cluster state from a tool itself, so we can use that to make deprecation status dashboards that would have been very manual before.
The problems with deb packaging make me wish webservice was written in go. I think I'm convincing myself by "typing it out" that the best way forward is our own mini-client like you are suggesting. It'll need to do old-style token auth and modern TLS auth, of course...
If our full deprecation for the legacy cluster is completed on or about 2020-02-10 as the current timeline states, I think we only need to worry about TLS auth.
it probably doesn't need to understand serviceaccount secrets at all right now. A new task to spec this out seems to make sense.
Probably a thread jack, but I'm curious what the utility of serviceaccounts in webservice would be. Supporting adding a serviceaccount to the pod template for a tool makes some sense, but that is just YAML submitted to the API, not a runtime thing.
It also seems to me that new future tooling could probably use either a different delivery method than deb packages (like maintain-kubeusers does with docker) or might want to be in a compiled binary format (go, rust, c, whatever)
This is certainly worth talking more about. I have always seen the deb packaging requirement as something that slows down development for Wikimedia production, but I'm not completely sold on statically linked binaries as the solution to that.
For the near term, I feel like doing a "just enough custom code to support existing use cases" approach to replacing pykube in webservice is the right thing to do. We need to unblock Kubernetes upgrades now. We do not need to solve all of our other dreams about a better cli tool to do that.
To be 100% clear, I think we should make a new task with the redesign on it...AND just unblock things now with something like a new little library based on requests, which could probably be repurposed for new CLI dreams later if it ends up talking to an external API one day.