
Decision request – Toolforge CLI consolidation
Closed, Resolved (Public)

Description

Background

One of the outcomes of T346153: Decision request – Toolforge (re)architecture is that we have decided to move forward with consolidating all the existing Toolforge CLIs into a single one. This is a follow-on decision request to decide on the details of how this should be done.

The CLIs are:

  • jobs
  • webservice
  • build
  • envvars
  • additionally, there's the toolforge wrapper

Decision Record

https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T348749_Toolforge_CLI_consolidation

Risks & Constraints

  • We should avoid breaking the current user experience and overall CLI semantics
  • Regardless of the option chosen, care must be taken to not introduce or carry over technical debt that could complicate future development
  • Each option will require varying levels of development effort and time, which should be considered in the context of other ongoing projects and priorities

Option 1

Merge the existing Python CLIs

Option 2

Rewrite a CLI from scratch in Go

Option 3

Create a unified OpenAPI definition for the CLI first, then autogenerate a Go client from it

Option 4

Create a unified OpenAPI definition for the CLI first, then autogenerate a Python client from it
Note: This option was added during the decision meeting

Option N

Add your option here

The Pros & Cons for each option will be added after discussion, but before voting.

Edit: summary of pros/cons

1. Merge Existing Python CLIs

Pros:

  • Uses existing Python code and expertise
  • Potentially quicker to implement due to familiarity
  • More accessible to new and less experienced contributors

Cons:

  • Python is generally slower than Go (CLI speed is not a bottleneck, however)
  • Lacks single-binary packaging which can be a limitation
  • Might require more effort for packaging and distribution

2. Rewrite CLI from Scratch in Go

Pros:

  • Benefits from Go's performance and efficiency
  • Enables single binary packaging, simplifying distribution
  • Full control over design and implementation from the start

Cons:

  • Inability to reuse existing Python codebase
  • Go's ecosystem is less rich compared to Python, which may limit certain functionalities
  • Potentially longer development time due to Go's relative verbosity and learning curve

3. Create a Unified Open API Definition, Autogenerate a Go Client

Pros:

  • Lowers long-term maintenance due to autogeneration
  • Unified API could provide broader benefits and standardization
  • Reduces human error in coding
  • Ensures comprehensive API documentation

Cons:

  • Time-intensive to create a complete and effective Open API specification
  • Autogenerated code may not fully leverage Go's features, possibly leading to less optimized solutions
  • Reliance on specific tooling and technologies for autogeneration
  • Like option 2, inability to reuse the existing Python codebase

Event Timeline

Slst2020 changed the task status from Open to In Progress. Oct 12 2023, 1:39 PM
Slst2020 moved this task from Next Up to In Progress on the Toolforge (Toolforge iteration 01) board.

Some aspects I see for each option:

Option 1

Pros:

  • We keep using Python, with which we might have more experience
  • A lot of the current code can be reused

Cons:

Option 2

Pros:

  • Single binary out of the box
  • We know some golang
  • Total control over what we want

Cons:

  • Current code can't be reused

Option 3

Pros:

  • Probably the least maintenance effort (autogenerated)

Cons:

  • Needs to have a unified API definition, something that we might want anyhow (allows generating many language libraries and clis)
  • Current code can't be reused

Agree with @dcaro's points, and also:

Option 1

Pros:

  • Reuse of code would likely speed up the migration
  • Allows us to do a gradual migration, merging one CLI at a time
  • New contributors may be more familiar with Python than Go

Cons:

  • Python is generally slower than Go (although that may not matter much in this specific case as the real speed bottleneck probably is network latency)

Option 2

Pros:

  • Go is generally faster, although see above

Cons:

  • Go's ecosystem is not yet as rich as Python's
  • Go can be quite verbose for some tasks for which Python has higher-level abstractions; longer development time

Option 3

Same as for Option 2, and also:
Pros:

  • Autogeneration minimizes the risk of human error
  • Open API specifications can be easily converted into comprehensive API documentation

Cons:

  • Autogenerated code may not take full advantage of language-specific features or optimizations, potentially leading to less idiomatic or less optimized Go code.
  • Introduces a dependency on the tooling around OpenAPI and code generation

Allows us to do a gradual migration, merging one CLI at a time

That can also be done with option 2 (the current modular design makes it easy to pluck out each CLI), and probably with option 3 (depending on how we generate the CLIs; we already started looking into this when generating the builds CLI from the API spec at the beginning, though we dropped that effort in favor of Python).
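To make the "one CLI at a time" idea concrete, here is a minimal sketch of a single entry point where each existing CLI is plugged in as a subcommand. It assumes argparse and uses hypothetical handler functions (`run_jobs`, `run_envvars`) as stand-ins; it is not the actual Toolforge code, only an illustration of how the merge could proceed incrementally.

```python
import argparse


# Hypothetical stand-ins for the entry points of the existing CLIs.
def run_jobs(args):
    return f"jobs: {args.action}"


def run_envvars(args):
    return f"envvars: {args.action}"


# Merging another CLI means adding one more entry to this table.
MERGED_CLIS = {
    "jobs": run_jobs,
    "envvars": run_envvars,
}


def build_parser():
    parser = argparse.ArgumentParser(prog="toolforge")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, handler in MERGED_CLIS.items():
        sub = subparsers.add_parser(name, help=f"{name} commands")
        sub.add_argument("action")
        # Remember which handler serves this subcommand.
        sub.set_defaults(handler=handler)
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

Calling `main(["jobs", "list"])` dispatches to the jobs handler; CLIs that have not been merged yet simply do not appear in the table and can keep living as separate commands in the meantime.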

Autogenerated code may not take full advantage of language-specific features or optimizations, potentially leading to less idiomatic or less optimized Go code.

Note that in the case of the autogenerated CLI, we might not want any contributions (as we might not have to code anything there), in which case being less idiomatic becomes a non-issue (which imo is a big plus). Also, as you mention for option 1, performance is not a bottleneck, so language-specific optimizations are not that relevant either.

I don't have a strong preference, but I vote for Option 1 as I would prefer to tackle the different objectives separately:

  1. consolidating the CLI
  2. migrating to Go (can be done later if we like)
  3. creating an OpenAPI definition (can be done later if we like)

Trying to achieve 2 or 3 things together seems (slightly) more risky and time-consuming, so my preference is to focus on objective 1, using Option 1.

I vote for option 3, as it's the one that will require the least duplicated effort, given that the API definition is something that we want to do anyhow.
It achieves the CLI consolidation without having to change or rewrite any CLI code, by generating it from the API definition.
And it achieves the binary CLI by generating it in golang/java/whatever compiled language we want.

As opposed to having to put effort to consolidate the cli, then more effort rewriting it in a compiled lang to distribute a binary, and then more effort again creating the API definition + cli generation.

What would "autogenerated CLI" mean? I remain sceptical that you can reliably do something more than generate an OOP wrapper around the API methods, which would do nothing about parsing and wiring up parameters, dealing with output formatting, handling changes and deprecations in functionality, etc.

For option 3, once the OpenAPI specification is complete, I presume you could also generate a Python client? It seems the implied goal is to end up with a Go client binary, but the OpenAPI option seems distinct from option 1 or 2?

In other words, I'm reading this request as follows. The toolforge CLI is being consolidated. Choices are: do it in Python or do it in golang. And further, use OpenAPI to do it (or not). Is this accurate?

Any thoughts on complexity for each option? A casual reading suggests they may be in order of increasing complexity or close.

What would "autogenerated CLI" mean? I remain sceptical that you can reliably do something more than generate an OOP wrapper around the API methods, which would do nothing about parsing and wiring up parameters, dealing with output formatting, handling changes and deprecations in functionality, etc.

There are always a few things that will need some manual coding, but the idea is to keep those to the minimum possible and use the autogenerated code in as many places as we can. The 'basic/poc' generated code we used for the builds-service already wrote the CLI subcommands, parser, help strings and parameter validation itself, if that's what you are skeptical about.

In the worst case (which I highly doubt will happen), generating the libraries out of the API is already a huge improvement.
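As a rough illustration of what "generating subcommands, parser, help strings and parameter validation" from a spec can look like, here is a minimal sketch. The `SPEC` dict is a hypothetical, heavily trimmed OpenAPI-style fragment (not the real Toolforge API), and the generator is hand-rolled with argparse rather than taken from any particular OpenAPI tool.

```python
import argparse

# Hypothetical OpenAPI-style fragment, for illustration only.
SPEC = {
    "paths": {
        "/jobs": {
            "get": {
                "operationId": "list-jobs",
                "summary": "List jobs",
                "parameters": [],
            },
        },
        "/jobs/{name}": {
            "delete": {
                "operationId": "delete-job",
                "summary": "Delete a job",
                "parameters": [
                    {
                        "name": "name",
                        "in": "path",
                        "required": True,
                        "description": "Job name",
                        "schema": {"type": "string"},
                    },
                ],
            },
        },
    }
}

# Map spec types to Python callables used by argparse for validation.
TYPES = {"string": str, "integer": int}


def parser_from_spec(spec, prog="toolforge"):
    """Build one subcommand per operation, one flag per parameter."""
    parser = argparse.ArgumentParser(prog=prog)
    subparsers = parser.add_subparsers(dest="operation", required=True)
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            sub = subparsers.add_parser(op["operationId"], help=op.get("summary"))
            sub.set_defaults(method=method.upper(), path=path)
            for param in op.get("parameters", []):
                sub.add_argument(
                    f"--{param['name']}",
                    required=param.get("required", False),
                    type=TYPES.get(param.get("schema", {}).get("type"), str),
                    help=param.get("description"),
                )
    return parser


def plan_request(argv, spec=SPEC):
    """Parse argv and return the (HTTP method, URL path) it maps to."""
    args = vars(parser_from_spec(spec).parse_args(argv))
    method, path = args.pop("method"), args.pop("path")
    args.pop("operation")
    values = {k: v for k, v in args.items() if v is not None}
    return method, path.format(**values)
```

With this sketch, `plan_request(["delete-job", "--name", "myjob"])` resolves to a `DELETE` on `/jobs/myjob`, and omitting `--name` is rejected by the parser. Output formatting and deprecation handling are not covered here and would still need manual code.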

For option 3, once the openapi specification is complete, I presume you could also generate a python client?

Yes, you can generate libraries + client in many languages.

It seems the implied goal is to end up with a Go client binary, but the OpenAPI option seems distinct from option 1 or 2?
In other words, I'm reading this request as follows. The toolforge CLI is being consolidated. Choices are: do it in Python or do it in golang. And further, use OpenAPI to do it (or not). Is this accurate?

We mention golang here because one of the goals discussed in the previous decision task was to be able to "easily install the client for users", and having a single self-contained binary to download, as generated by golang, is a big plus there, versus having to download a Debian package + dependencies (which is what option 1 would require). See T346153#9243305

Any thoughts on complexity for each option? A casual reading suggests they may be in order of increasing complexity or close.

I would say that the easiest might be option 1 (depending on how we migrate them), as it just needs gluing the existing code together, though it might need some refactoring.
The most complex would be option 2, as it requires a manual rewrite of every CLI.
And in between would be option 3, which needs a bit of effort to get the API definition, and a bit of effort and exploration to get the generated client usable.

Note that none of these options mentions that the webservice cli needs some modifications to be consolidated with the rest, and it deals with both k8s and grid (and it's quite complex), so I'd migrate the k8s side of it to an API before doing so and use that only on the cli. That pre-effort is needed and shared with all of the options.

I think we should go with Option 1 in the short term and Option 3 in the long term. Option 2 is out of the question in my opinion because it offers no real benefit over Option 1 (the one advantage it has, speed, is not really a bottleneck for the type of application we are building, so it doesn't mean much, yet we have to worry about the time it will take to complete).
Option 1 in the short term because we want to start enjoying the benefits of the merge as soon as possible, and Option 3 will likely take a while before we can come up with a complete OpenAPI spec.
Option 3 in the long term because it helps us standardize the way we develop things for Toolforge. I can see the spec created for Option 3 being used in the future for something like a Toolforge UI.

I vote for option 3, as it's the one that will require the least duplicated effort, given that the API definition is something that we want to do anyhow.
It achieves the CLI consolidation without having to change or rewrite any CLI code, by generating it from the API definition.
And it achieves the binary CLI by generating it in golang/java/whatever compiled language we want.

As opposed to having to put effort to consolidate the cli, then more effort rewriting it in a compiled lang to distribute a binary, and then more effort again creating the API definition + cli generation.

+1 to all of this.

If Option 3 is the end goal, passing through Option 1 will just be time consuming without adding enough additional benefits, in my opinion.

Option 1 in the short term because we want to start enjoying the benefits of the merge as soon as possible, and Option 3 will likely take a while before we can come up with a complete OpenAPI spec.

Maybe I'm misunderstanding how this works, but I think we don't have to create the spec all at once? Even if this is the case, I think it would be worth the extra wait/effort. As @dcaro mentioned, a unified API definition is probably something that we might want anyhow.

tl;dr I vote for option 3

Slst2020 changed the task status from In Progress to Stalled. Nov 7 2023, 2:47 PM
Slst2020 changed the task status from Stalled to In Progress. Jan 24 2024, 2:10 PM

As there is no clear consensus, a decision meeting will be scheduled as described here: https://www.mediawiki.org/wiki/Wikimedia_Cloud_Services_team/Decision_Making


Option 4 was chosen.