Background
The Toolforge API gateway authenticates and routes the API requests that control Toolforge tools. Authentication is currently implemented with client certificates signed by the Kubernetes cluster client CA. This works for now, but it blocks some new features and improvements, so we need a new authentication system that the various API clients can use.
Requirements
Here are some use cases which we should account for when designing the new system:
- Credentials for single tools, like the current ones provisioned for each tool's NFS home directory
- Credentials which can access all tools maintained by a single user, for example for Terraform or some other locally installed CLI tool
- Striker should have credentials which can access all tools
- The credentials should have some security features built in, like limiting which APIs a credential can be used with and tracking when it is used
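
To make these use cases concrete, here is a rough sketch of how the three credential types and the API restriction could be modelled. Everything in it is an assumption for illustration only: the `Credential` class, the `authorize` helper, the `maintainers` mapping and the API names ("jobs", "builds") are hypothetical and not part of any existing Toolforge code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class CredentialKind(Enum):
    TOOL = "tool"              # access to exactly one tool
    MAINTAINER = "maintainer"  # access to every tool a user maintains
    GLOBAL = "global"          # access to all tools (for example Striker)


@dataclass
class Credential:
    kind: CredentialKind
    subject: str                  # tool name, user name or service name
    allowed_apis: frozenset       # hypothetical API names this credential may call
    last_used: Optional[datetime] = None  # usage tracking


def authorize(cred, tool, api, maintainers):
    """Return True if `cred` may call `api` on `tool`.

    `maintainers` is a hypothetical mapping of tool name to the set of
    user names that maintain it.
    """
    if api not in cred.allowed_apis:
        return False
    if cred.kind is CredentialKind.TOOL:
        allowed = cred.subject == tool
    elif cred.kind is CredentialKind.MAINTAINER:
        allowed = cred.subject in maintainers.get(tool, set())
    else:
        allowed = True
    if allowed:
        # Record when the credential was last used successfully.
        cred.last_used = datetime.now(timezone.utc)
    return allowed
```

With this shape, a Terraform-style credential would be a `MAINTAINER` credential for the user, while the per-tool credentials we have today would map to `TOOL`.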
Implementation
The API gateway is currently implemented with Nginx, which supports using an external API server to authenticate API requests. It's also possible to switch to something like Envoy, which has built-in support for various authentication methods.
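
As a sketch of what the external authentication server could look like from the Nginx side: with Nginx's `auth_request` mechanism, the gateway sends a subrequest to the auth server, treats a 2xx response as "allowed" and a 401 or 403 as "denied". The token table, the `X-Auth-Subject` header name and the port below are made up for illustration, and a real service would validate tokens properly instead of looking them up in a dict.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static token table, only here to keep the sketch self-contained.
KNOWN_TOKENS = {"example-token": "tool:mytool"}


def check_authorization(header_value):
    """Map an Authorization header value to (HTTP status, subject or None)."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return 401, None
    subject = KNOWN_TOKENS.get(header_value[len(prefix):])
    if subject is None:
        return 401, None
    # Any 2xx status tells the gateway to let the request through.
    return 204, subject


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, subject = check_authorization(self.headers.get("Authorization"))
        self.send_response(status)
        if subject is not None:
            # Hypothetical header the gateway could pass on to the backend API.
            self.send_header("X-Auth-Subject", subject)
        else:
            self.send_header("WWW-Authenticate", "Bearer")
        self.end_headers()

    # The subrequest may arrive with the original request's method,
    # so answer the other common methods the same way.
    do_POST = do_PUT = do_PATCH = do_DELETE = do_GET


def serve(port=8081):
    """Run the sketch on localhost; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), AuthHandler).serve_forever()
```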
I think the best way is to implement OAuth2, which is the industry standard. That would involve a separate authentication server, which would support the credential types described above and then issue short-lived bearer tokens that the API gateway can validate and extract user data from. The authentication server needs Toolforge-specific "business logic", which makes me think it needs to be custom software. In its simplest form it can issue bearer tokens based on the existing Kubernetes certificate authentication, and then we can expand it to cover the other use cases.
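
To illustrate the "short-lived bearer token that the gateway can validate" part, here is a minimal sketch using only the Python standard library. It is not a proposal for the actual token format: a real deployment would more likely use a standard format such as JWT with asymmetric keys. The shared secret, the five-minute lifetime, the claim names and the function names are all assumptions made up for this example.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # hypothetical shared secret, never hardcode in practice
TOKEN_TTL_SECONDS = 300       # assumed lifetime for a "short-lived" token


def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload):
    digest = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).digest()
    return _b64(digest)


def issue_token(subject, scopes, now=None):
    """Auth server side: issue a signed token that expires after the TTL."""
    now = time.time() if now is None else now
    claims = {
        "sub": subject,
        "scope": " ".join(sorted(scopes)),
        "exp": int(now) + TOKEN_TTL_SECONDS,
    }
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return payload + "." + _sign(payload)


def validate_token(token, now=None):
    """Gateway side: return the claims, or None if the token is invalid or expired."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    expected = _sign(payload)
    # Constant-time comparison of the signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_unb64(payload))
    if claims["exp"] <= now:
        return None
    return claims
```

In the simplest form described above, `issue_token` would be called after the client has authenticated with its existing Kubernetes client certificate, with the subject taken from that certificate.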
Proposal
There is a draft here: https://wikitech.wikimedia.org/wiki/User:Majavah/EnhancementProposals/Toolforge_API_OAuth_2_support