Problem
Currently, Striker (toolsadmin.wikimedia.org) exposes the only user-accessible method to create and modify tool accounts. It does this by holding LDAP credentials with full write access and exposing a web interface for authenticated users.
There are, however, several other use cases that need to store persistent information about tools:
- The tool account deletion process and the related mark_tool CLI, which run on a cloudcontrol host with full LDAP write credentials
- Other parts of the tool deletion process, which use files on NFS and a MariaDB database on ToolsDB
- The SRE offboarding tool will remove users from certain privileged tools
- A future hypothetical user-installable Toolforge CLI tool could benefit from commands to create/update tools
- Similarly, T393010 ([DRAFT] Decision Request - Initial product approach to integrating Toolforge UI functionality with Toolsadmin) might go in a direction where it needs to write to LDAP
Plus there are numerous other workflows that consume read-only tool information that are not listed here.
All of these use separate code to read and possibly write tool entries in LDAP.
Direct LDAP write access has historically been restricted to the wikiprod realm. This is one of the main reasons why Striker runs on cloudweb* hardware and not on a Cloud VPS VM or in the Toolforge k8s cluster.
Constraints and risks
TBD
The other highly privileged action that Striker performs, adding and removing members from the tools Cloud VPS project, is out of scope here. That is an OpenStack operation, which we could already restrict with custom roles and more specific policies.
Options
Option 1
Do nothing.
Pros:
- No engineering work required
- It already works
Cons:
- All future use cases will need to come up with their own ad hoc code and think about deployment considerations for LDAP write access
- Some options (e.g. user-installable CLIs) will not be possible to implement
Option B
Keep the LDAP logic in Striker, and expose an API that all write operations will use.
Pros:
- Minimal engineering work (Striker is already deployed and has LDAP credentials that work)
- Will unlock some features that would otherwise not be possible
Cons:
- Heavy coupling between frontend/UI features and backend logic
- Will need to come up with a way to authenticate those API calls
- Does not unblock moving Striker off of wikiprod hardware
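To make the shape of such a write call concrete, here is a minimal client-side sketch. Everything in it is an assumption for illustration: the base URL, the `/tools` path, the JSON field names, and the bearer-token scheme are hypothetical, since neither the API nor its authentication method has been designed yet. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical base URL; no such API exists yet.
API_BASE = "https://toolsadmin.example.org/api/v1"


def build_create_tool_request(name, maintainers, token):
    """Build (but do not send) an HTTP request that would create a tool.

    The path, payload fields, and bearer-token auth are illustrative
    assumptions, not a description of an existing Striker endpoint.
    """
    body = json.dumps({"name": name, "maintainers": maintainers}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/tools",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

A caller such as a CLI or the offboarding tool would pass the resulting request to `urllib.request.urlopen` and would never need LDAP credentials of its own.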
Option Γ (Gamma)
As option B, but also migrate all read operations to the new API where possible.
Pros:
- All LDAP logic gets consolidated in one place, everything else is "just" standard HTTP calls
- Allows better caching and similar optimizations, since all reads go through one place
Cons:
- More engineering work than in option B
- All other cons of option B apply
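As an illustration of the caching benefit, the sketch below shows a small read-through cache with a time-to-live that could sit in front of LDAP reads once they all pass through one API. The class name, the `fetch` callable, and the 60-second default are hypothetical choices made for this example, not part of any existing code.

```python
import time


class ToolInfoCache:
    """Read-through cache with a per-entry time-to-live (illustrative only)."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable that performs the real lookup
        self._ttl = ttl_seconds  # how long an entry stays fresh
        self._clock = clock      # injectable clock, useful for testing
        self._entries = {}       # tool name -> (stored_at, value)

    def get(self, tool_name):
        now = self._clock()
        cached = self._entries.get(tool_name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Missing or stale: fall through to the backing store.
        value = self._fetch(tool_name)
        self._entries[tool_name] = (now, value)
        return value

    def invalidate(self, tool_name):
        """Drop an entry, for example right after a write to that tool."""
        self._entries.pop(tool_name, None)
```

Because writes would go through the same service, it can call `invalidate` after each successful write. Independent LDAP consumers cannot coordinate that way today.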
Option Beryllium
Write a new backend service which will consolidate all LDAP writing logic and expose a standard HTTP API for those operations.
Pros:
- Uncouples UI and backend logic
- Unblocks various new use cases
Cons:
- Requires substantially more engineering work to implement
- Increases system complexity by introducing yet another service
- Will need to come up with a way to authenticate those API calls
- Will not enable API editing for resources currently managed in Striker (including toolinfo records, GitLab repos, Phab projects, membership requests)
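The decoupling this option aims for can be sketched as a storage interface that the HTTP layer depends on, with LDAP being just one implementation behind it. All names below (`ToolStore`, `InMemoryToolStore`, `handle_add_maintainer`) are hypothetical. The in-memory store stands in for a real LDAP-backed one, and authentication is deliberately left out because that question is still open.

```python
from typing import Protocol


class ToolStore(Protocol):
    """What the HTTP layer needs from storage; LDAP would implement this."""

    def add_maintainer(self, tool: str, user: str) -> None: ...

    def maintainers(self, tool: str) -> list[str]: ...


class InMemoryToolStore:
    """Stand-in for an LDAP-backed store, for illustration and tests."""

    def __init__(self):
        self._members = {}

    def add_maintainer(self, tool, user):
        members = self._members.setdefault(tool, [])
        if user not in members:
            members.append(user)

    def maintainers(self, tool):
        return list(self._members.get(tool, []))


def handle_add_maintainer(store, tool, payload):
    """Framework-agnostic handler: returns (status_code, response_body)."""
    user = payload.get("user")
    if not user:
        return 400, {"error": "missing 'user'"}
    store.add_maintainer(tool, user)
    return 200, {"tool": tool, "maintainers": store.maintainers(tool)}
```

With this split, only the store implementation would need LDAP write credentials, and the handler could be exercised without any LDAP server.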
Option Purple
As option Beryllium, but also consolidate all read operations to the new service when possible.
Pros:
- All LDAP logic gets consolidated in one place, everything else is "just" standard HTTP calls
- Allows better caching and similar optimizations, since all reads go through one place
Cons:
- Even more upfront engineering work than in option Beryllium
- All other cons of option Beryllium apply
Option Games
Introduce APIs for everything Striker can do (like in option Γ), then move the frontend code somewhere else.
Pros:
- Uncouples UI and backend logic
- Unblocks various new use cases
Cons:
- Engineering work required to implement
- Django might not be the best tool to implement a headless backend
- Hard to migrate UI components one at a time