Decision request - Choose a lang for the toolforge build service API
Closed, Resolved (Public)

Description

NOTE: This follows the discussion on whether to move to an API design; see T326136: Decision request - Toolforge build service to move to an API design

Problem

We have decided to create an API for the toolforge build service; we now need to choose a language for it.

Constraints and risks

  • We have considerable python knowledge in the team
  • We have little golang knowledge in the team; this might increase the time to build it
  • The k8s ecosystem is built around golang; this might ease maintenance if golang is chosen
  • Some of our components are written in golang (all the admission/validation hooks); choosing golang might help with maintaining those
  • Toolforge Job service is written in python and has a similar design; this might help with reusing code and/or sharing knowledge if python is chosen
  • There's no strong investment either way for toolforge components, though there have been no trials of API design with golang. Doing one could give us experience with both languages before deciding whether to converge all components on one lang/tooling

NOTE: Full list of libraries here T325382#8488267

Decision record

Option 1 was chosen.

https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T325382_Choose_a_lang_for_the_toolforge_build_service_API

Options

Option 1

Implementing it in golang.

Pros:

  • Golang k8s libraries are the only ones manually developed by upstream, and they are the main libraries that upstream designs for. They have extensive concurrency support, are typed, and have testing helpers; all the others are either automatically generated or not maintained by upstream.
  • We gain experience with golang component design
  • We gain experience with golang as a whole (and better support current golang components)
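
To make the "typed" and "testing helpers" points a bit more concrete, here is a minimal sketch that uses only the Go standard library, not the upstream k8s client. The podList struct, listPodNames and fakeAPI are hypothetical names invented for this illustration, and the JSON payload is a made-up, heavily trimmed stand-in for an API response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// podList is a hypothetical, heavily trimmed stand-in for a typed API object.
type podList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
	} `json:"items"`
}

// listPodNames fetches url and decodes the JSON body into the typed struct,
// so a misspelled field is a compile error rather than a runtime KeyError.
func listPodNames(client *http.Client, url string) ([]string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var pods podList
	if err := json.NewDecoder(resp.Body).Decode(&pods); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(pods.Items))
	for _, p := range pods.Items {
		names = append(names, p.Metadata.Name)
	}
	return names, nil
}

// fakeAPI starts an in-process HTTP server that returns a canned pod list,
// playing the role that a library-provided fake client would play in tests.
func fakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"items":[{"metadata":{"name":"build-1"}},{"metadata":{"name":"build-2"}}]}`)
	}))
}

func main() {
	srv := fakeAPI()
	defer srv.Close()

	names, err := listPodNames(srv.Client(), srv.URL)
	fmt.Println(names, err)
}
```

The upstream golang libraries ship their own typed objects and fake clients, so a real implementation would likely not need to hand-roll either part; the sketch only shows the shape of the workflow.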

Cons:

  • Some initial investment is needed to develop the API

Option 2

Implementing it in python.

Pros:

  • Python knowledge in the team makes it simple to use best practices
  • We can reuse some of the code already existing for toolforge-jobs-framework

Cons:

  • The python k8s libs are autogenerated from the golang ones and lack builtin async support (they have a limited concurrency API that does not use the builtin python one), type hints, and testing helpers.

Event Timeline

dcaro updated the task description. (Show Details)
dcaro changed the task status from Open to In Progress. Dec 21 2022, 8:52 AM
dcaro moved this task from To refine to Doing on the User-dcaro board.

My votes: 4 -> {3,2} -> 1

Thanks for your vote!
I would appreciate any comments you have on the options, or some extra rationale on why you voted this way.
I'm trying to gather opinions/discuss on pros-cons of the current options or find new more suitable ones.

I do think moving to an API based design makes sense, but I don't have a preference whether it should be written in Go or Python.

Can you share more details on which library you might be targeting? Looking at https://kubernetes.io/docs/reference/using-api/client-libraries/, I see a list of official libs, along with many that are unofficial. This would help to look at example implementations and get a feel for each library before weighing an opinion. Thanks!

Sure, so far I was only checking the official ones, but let's take a look at the others.

Python

  • current libs - We are currently using a very thin snippet of code to do requests directly:
    • one copy on each codebase that uses it
    • no types
    • no tests
    • any changes we have to make ourselves
    • no async support
    • example
  • github.com/fiaas/k8s - a Norwegian feed company (FINN.no) is behind it; the main contributor seems to have moved companies
    • not much activity
    • no compatibility matrix
    • does not support many resources (PSPs for example)
    • no async
    • no testing helpers
  • github.com/gtsystem/lightkube - HERE technologies (1 person maintaining it, plus some contributions from Canonical)
    • few maintainers
    • very active
    • not mature (they say not ready for prod)
    • has types
    • wide support on the same library
    • async support
    • no testing helpers
  • github.com/mnubo/kubernetes-py - AspenTech
    • some maintainers
    • no activity in a while (the main contributor changed companies it seems)
    • limited resources support
    • no types
    • no async
    • no testing helpers
  • github.com/tomplus/kubernetes_asyncio - personal project
    • one main maintainer, many small contributions
    • quite active
    • similar versioning as the official library
    • partial support for streaming (read-only)
    • no test helpers
    • no types
  • github.com/Frankkkkk/pykorm - personal/sokube cloud solutions
    • few maintainers
    • little activity
    • claims not to be very stable
    • ORM pattern oriented
    • focusing on easy CRD support
    • no async support
    • no testing helpers
    • no types

Golang

  • github.com/ericchiang/k8s - archived in november (read-only)

I might have missed or misinterpreted something; if so, please let me know (I'm not familiar with most of these libraries). If there are any others you want to consider, please add them too.

If the main reason for this proposal was "we just want to have fun playing with & learning a new technology" then that's OK. I want to play & learn too. But that's not a technical reason :-) I don't see strong technical reasons, see below.

As you pointed out already, so far we have been using 2 options to interact with k8s from python:

In my experience, either of the two is more than enough. They are robust. They work well. They are easy to use. They work well with python-flask, which is mostly what we use here for building REST APIs. In particular, regarding the statement "The python k8s libs are not as well maintained, harder to test and develop with", I think quite the opposite is true, actually.

We also have a few Toolforge k8s custom components written in golang. They are tiny admission webhooks that do simple things, examples: volume-admission-controller, ingress-admission-controller, registry-admission-webhook, etc.
On the other hand, the proposed new codebase is not comparable in scale to those tiny webhooks. It will be at least 2 major projects (cli, api) in a domain that is already complex enough (build service for buildpacks, etc), and it could otherwise be built with either of the 2 python options. More so if you consider that we already have similar codebases implementing the same CLI <-> API architecture (jobs-framework-api and jobs-framework-cli, both in python). I understand that some folks dislike the ~100 LOC embedded custom k8s library, but rewriting everything in a different lang as the solution feels a bit overkill.

For me, the reasons stated in the proposal aren't enough to justify the adoption of golang at such scale. That's why my vote goes first for option 4, then for any of 3 or 2.

Moreover, the auth dancing that this new API would require is going to be very similar to what jobs-framework-api uses. This is already implemented in python. I would love to see the code refactored and reused in this new buildservice API.
In the future, if we decide so, it can also be adopted by tools-webservice (if we ever move that to an API architecture).

Thanks a lot for your input Arturo!

btw. the auth dance is similar, and that part of the api was already sorted out in golang too:

package main

import (
	"crypto/tls"
	"crypto/x509"
	"flag"
	"fmt"
	"log"
	"net/http"
	"os"
	"time"
)

// This POC validates the tool's certificate against the k8s authority (/etc/kubernetes/pki/ca.crt on the exec nodes)
// Example usage:
// go run server.go -srvkey server.key -cacert k8scert.crt -srvcert server.crt
//
// 2022/08/26 17:23:38 Starting HTTPS server on localhost and port 12345
// 2022/08/26 17:23:39 Got GET for 127.0.0.1:12345 from IP 127.0.0.1:38972 and common name wm-what
// 2022/08/26 17:23:39 Sent response Hello wm-what!
//
// On the cli:
// curl --insecure https://127.0.0.1:12345/ --cert /data/project/wm-what/.toolskube/client.crt --key /data/project/wm-what/.toolskube/client.key
// Hello wm-what!

func main() {
	srvCert := flag.String("srvcert", "", "Required, the name of the server's certificate file")
	caCert := flag.String("cacert", "", "Required, the name of the CA that signed the client's certificate")
	srvKey := flag.String("srvkey", "", "Required, the file name of the server's private key file")
	flag.Parse()

	usage := `usage:

go run server.go -cacert <caCertFile> -srvkey <serverPrivateKeyFile> -srvcert <serverCertFile>

Options:
  -cacert  path to the certificate authority file
  -srvkey  path to the key file for the server
  -srvcert path to the cert file for the server`

	if *caCert == "" || *srvKey == "" || *srvCert == "" {
		log.Fatalf("Missing cacert, srvkey or srvcert:\n%s", usage)
	}

	server := &http.Server{
		Addr:         ":12345",
		ReadTimeout:  10 * time.Second,
		WriteTimeout: 10 * time.Second,
		TLSConfig:    getTLSConfig("localhost", *caCert),
	}

	// The server has no explicit Handler, so it serves the default mux.
	http.HandleFunc("/", handleRoot)

	log.Println("Starting HTTPS server on localhost and port 12345")
	if err := server.ListenAndServeTLS(*srvCert, *srvKey); err != nil {
		log.Fatal(err)
	}
}

// handleRoot greets the caller by the common name of its client certificate.
// RequireAndVerifyClientCert guarantees a verified chain on every request.
func handleRoot(w http.ResponseWriter, r *http.Request) {
	commonName := r.TLS.VerifiedChains[0][0].Subject.CommonName
	log.Printf("Got %s for %s from IP %s and common name %s",
		r.Method,
		r.Host,
		r.RemoteAddr,
		commonName,
	)
	resp := fmt.Sprintf("Hello %s!", commonName)
	if _, err := w.Write([]byte(resp)); err != nil {
		log.Printf("Error sending response: %v", err)
		return
	}
	log.Printf("Sent response %s", resp)
}

// getTLSConfig builds a TLS config that only accepts clients whose
// certificate was signed by the CA stored in caCertFile.
func getTLSConfig(host, caCertFile string) *tls.Config {
	caCert, err := os.ReadFile(caCertFile)
	if err != nil {
		log.Fatalf("Error opening cert file %s: %v", caCertFile, err)
	}
	caCertPool := x509.NewCertPool()
	if !caCertPool.AppendCertsFromPEM(caCert) {
		log.Fatalf("No valid PEM certificates found in %s", caCertFile)
	}

	return &tls.Config{
		ServerName: host,
		ClientCAs:  caCertPool,
		ClientAuth: tls.RequireAndVerifyClientCert,
		MinVersion: tls.VersionTLS12, // TLS below 1.2 is considered insecure https://www.rfc-editor.org/rfc/rfc7525.txt
	}
}
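
If this POC grew into the real API, the step that turns a verified client certificate into a tool name would probably be worth isolating so it can be unit tested without a TLS handshake. The following is only a sketch of that idea: toolFromRequest and fakeRequest are hypothetical names that are not part of the POC, and the request is built in memory with a hand-made certificate that carries nothing but a common name.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"net/http"
)

// toolFromRequest returns the common name of the verified client certificate.
// It reports false, instead of panicking, when the request has no verified chain.
func toolFromRequest(r *http.Request) (string, bool) {
	if r.TLS == nil || len(r.TLS.VerifiedChains) == 0 || len(r.TLS.VerifiedChains[0]) == 0 {
		return "", false
	}
	return r.TLS.VerifiedChains[0][0].Subject.CommonName, true
}

// fakeRequest builds an in-memory request for tests (hypothetical helper).
// An empty commonName simulates a connection without a client certificate.
func fakeRequest(commonName string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, "https://localhost:12345/", nil)
	if err != nil {
		panic(err) // the URL above is a constant, so this should not happen
	}
	if commonName == "" {
		return r
	}
	leaf := &x509.Certificate{Subject: pkix.Name{CommonName: commonName}}
	r.TLS = &tls.ConnectionState{VerifiedChains: [][]*x509.Certificate{{leaf}}}
	return r
}

func main() {
	name, ok := toolFromRequest(fakeRequest("wm-what"))
	fmt.Println(name, ok) // wm-what true

	_, ok = toolFromRequest(fakeRequest(""))
	fmt.Println(ok) // false
}
```

With tls.RequireAndVerifyClientCert set, as in the POC, the nil checks should never trigger in production; they mainly keep the helper safe if it is ever reused behind a listener with a looser ClientAuth setting.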

In the future, if we decide so, it can also be adopted by tools-webservice (if we ever move that to an API architecture).

Yep, that's another reason why exploring another approach (golang) will help make a more informed decision. I would want to avoid making a decision mainly based on lack of knowledge/experience with the alternative or inertia from current practices.

I'd like to see this decision request split into several smaller ones:

  • Should we reimplement Toolforge Buildservices as a REST API? This question is independent of the Python vs Go discussion, as the architecture in itself is language-agnostic.
  • Would it make sense to (re)implement some new and existing backend services in Go? If so, which ones? This should probably be discussed on a case-by-case basis, although I do see several overarching higher-level incentives for the team to develop some Go skills.
  • Should we migrate toolforge-cli to Go? If we adopt an API architecture, toolforge-cli would no longer have any direct interaction with k8s. Unless the intention is to use the toolforge-cli project as a playground to learn Go, I see no real incentive to migrate it.

I'd like to see this decision request split into several smaller ones:

  • Should we reimplement Toolforge Buildservices as a REST API? This question is independent of the Python vs Go discussion, as the architecture in itself is language-agnostic.
  • Would it make sense to (re)implement some new and existing backend services in Go? If so, which ones? This should probably be discussed on a case-by-case basis, although I do see several overarching higher-level incentives for the team to develop some Go skills.

For the build service case, this one depends on the previous one, if we don't move to an API architecture, there's no rewrite needed.

For the wider scope, I agree we should check case-by-case, so that is probably out of scope for this decision. If we decide to write an API, and to write it in golang, that experience will help when deciding the other cases.

  • Should we migrate toolforge-cli to Go? If we adopt an API architecture, toolforge-cli would no longer have any direct interaction with k8s. Unless the intention is to use the toolforge-cli project as a playground to learn Go, I see no real incentive to migrate it.

And as you hint, yes, the rewrite of the toolforge-cli is purely to learn Go deployment, packaging and best practices, no other incentive. So again, this depends on the previous point, that depends on the first one xd

I wanted to avoid having three different consecutive decisions by summarizing this one as:

  • 1) API no (if no API, there's no rewrite of anything, so no more choices)
  • 2) API yes but in golang without toolforge-cli rewrite
  • 3) API yes but in golang with toolforge-cli rewrite
  • 4) API yes but in python (if in python, rewrite of toolforge-cli is moot)
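
Read as a decision function, the numbering above can be sketched as follows. The function name optionFor and its three boolean parameters are made up for this illustration; the returned numbers follow the option definitions in the list just above.

```go
package main

import "fmt"

// optionFor maps the three yes/no decisions onto the four numbered options:
// 1) no API, 2) golang API without toolforge-cli rewrite,
// 3) golang API with toolforge-cli rewrite, 4) python API.
func optionFor(wantAPI, wantGolang, rewriteCLI bool) int {
	switch {
	case !wantAPI:
		return 1 // no API means no rewrite, so the other answers are moot
	case !wantGolang:
		return 4 // python API, the cli rewrite question is moot
	case rewriteCLI:
		return 3
	default:
		return 2
	}
}

func main() {
	fmt.Println(optionFor(false, true, true))  // 1
	fmt.Println(optionFor(true, true, false))  // 2
	fmt.Println(optionFor(true, true, true))   // 3
	fmt.Println(optionFor(true, false, false)) // 4
}
```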

So essentially you can map the three decisions to 4 options:

  • If you want no api -> 1
  • If you want api -> 2, 3 or 4
    • if you want python -> 4
    • if you want golang -> 2 or 3
      • if you want to rewrite the toolforge-cli -> 3
      • if you don't want to rewrite the toolforge-cli -> 2

Does this help or do you still think we need 3 different tasks?

I think the decision about whether to move TBS to an API architecture warrants its own discussion, independent of the actual implementation. I understand that choosing a programming language would be an immediate follow-up decision, together with other implementation details. But as you said yourself, that discussion would be moot if we decided against the API, so this seems to me a separate question better left for later so that we all can focus on one concept at a time.

My preference would be to only discuss API vs no API to start with, then go from there.

dcaro renamed this task from Decision request - Toolforge build service to move to an API design to Decision request - <DEPENDS ON T326136> Choose a lang for the toolforge build service API. Jan 3 2023, 12:40 PM
dcaro updated the task description. (Show Details)

Moved to T326136: Decision request - Toolforge build service to move to an API design. Please add comments there; this discussion will be halted until that one is resolved, thanks!

dcaro updated the task description. (Show Details)

A decision was made to continue with the API design on T326136: Decision request - Toolforge build service to move to an API design, so this discussion can now continue. I have adapted the task description; please review, thanks.

dcaro renamed this task from Decision request - <DEPENDS ON T326136> Choose a lang for the toolforge build service API to Decision request - Choose a lang for the toolforge build service API. Feb 6 2023, 10:15 AM

I find it hard to decide personally, so I will not cast a vote for one of the options.

I think there is value in using Python because that's the main language used in the team, everyone is already familiar with it, and I'm sure it can handle the requirements of this project pretty well. But I also see value in using Go, mainly because of the typing support, but also as a way to increase the skills of the team (myself included) in a language that is becoming more and more relevant in the SRE space.

Agree, I lean towards the golang option as it allows us to become more future proof, in the sense of having a better view of both options, and expanding our knowledge and skillset to more than one stack (that we are already using, so not really introducing much new, just expanding on the best practices).

Of course, that only holds if we develop it as we have so far, with more than one person at a time; otherwise the knowledge does not get shared.

I lean toward option 1. I think the pros outweigh both the cons of this option, and the pros of option 2.

From my personal experience with Go so far, it's a relatively easy language to get started with, e.g. compared to Rust. The syntax should be familiar to anyone with some experience with any C-type language, and core Go is by design minimal and devoid of any magic and syntactic sugar. The Gophers Slack community is also very active and helpful – anytime I've asked a question there, someone has helped me out within an hour, or sometimes even minutes.

I'd probably keep the toolforge-cli in Python, though. As k8s interactions become intermediated by the API, I see no obvious reasons to spend time porting it to Go.

my opinion on this is in two parts:

  1. If the plan is to work on this and maybe get it done in a month or less, we should go with option 4.
  2. If we are ready to spend more time on this (seems like we are), then we can decide to build this with golang, option 2.

It was decided today in the meeting to use Golang for the API service.