In T341553#10574262, @Urbanecm_WMF raises the point that it's difficult to use kubectl to list mwscript-k8s jobs with their script name, args, and job status, and asks for a dedicated command to do so.
Context:
So far, our intention has been that interactions with maintenance script Kubernetes jobs (and pods, and containers, and so on) should be done using the regular Kubernetes tools. The one exception is the mwscript-k8s tool for launching a new job, which wraps the regular tools with maintenance-script-specific logic -- that was necessary because of the way maintenance scripts' Helm chart and helmfile are implemented. As long as we needed a wrapper script anyway (to generate a new release name for each run, for example) we took the opportunity to design a UI that was, if not exactly the same as the familiar pre-Kubernetes mwscript tool, at least recognizable.
We intended for that to stand on its own, not to be the first of a fleet of mwscript-specific Kubernetes tools. Instead, operators of maintenance scripts -- who are now operators of Kubernetes jobs -- are expected to learn their way around the Kubernetes interface enough to get their work done (and for common cases, the mwscript-k8s launcher prints out the kubectl commands you need, for easy pasting). This isn't unprecedented; on the pre-Kubernetes mwmaint hosts, the only way to inquire about running maintenance scripts is with standard commands like ps, learning them first if necessary.
This isn't only a philosophical stance. Any tools we write on top of the Kubernetes API will necessarily be incomplete. We can write maintenance-specific wrappers for as many common kubectl commands as we can think of (spending as much engineering time as necessary) but as soon as something unexpected happens that these wrappers don't support, operators will still have to know how to use the real kubectl tools to manage their job -- only now they'll be further away from the information that makes this possible.
For all these reasons I still think we shouldn't write more mwscript-specific Kubernetes CLI wrappers.
The reason I'm opening this task anyway is that @Urbanecm_WMF is also right that kubectl get job is an extremely difficult tool to use for certain common tasks that ought to be easy. It's not unusual to want to read the mwscript args for a running job; kubectl can do this, but only by either using a custom column definition or emitting and parsing JSON, both of which require long, unwieldy commands.
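To make the JSON route concrete, here is a rough sketch of the parsing involved. It is not an existing tool. It assumes the script arguments live in the first container's args field and that the labels are literally named username and script; both assumptions, and the function names, would need checking against the real chart.

```python
import json


def job_status(status):
    """Collapse a Job's .status block into one word."""
    for cond in status.get("conditions", []):
        if cond.get("status") == "True" and cond.get("type") in ("Complete", "Failed"):
            return cond["type"]
    # Simplification: anything without active pods is reported as "Pending".
    return "Running" if status.get("active") else "Pending"


def summarize_jobs(kubectl_json):
    """Return one dict per job found in `kubectl get jobs -o json` output."""
    rows = []
    for job in json.loads(kubectl_json).get("items", []):
        meta = job.get("metadata", {})
        labels = meta.get("labels", {})
        pod_spec = job.get("spec", {}).get("template", {}).get("spec", {})
        containers = pod_spec.get("containers", [])
        # Assumption: the script arguments are the first container's args.
        args = containers[0].get("args", []) if containers else []
        rows.append({
            "job": meta.get("name", ""),
            # Assumption: the labels are named "username" and "script".
            "username": labels.get("username", ""),
            "script": labels.get("script", ""),
            "args": " ".join(args),
            "status": job_status(job.get("status", {})),
        })
    return rows
```

Feeding it the output of kubectl get jobs -o json yields one row per job, which could then be printed as a table. Even this minimal version shows how much structure an operator has to know just to see the args next to the status.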
We added labels for username and script, to make it easy to pass e.g. -L username to kubectl get, but label values have a maximum length of 63 characters, so we can't put the args there. We could put them in an annotation, but that doesn't make it substantially easier to include them in kubectl get output; you'd still need a custom column or JSON to get at the annotation.
Some options:
- We could do nothing. This would be technically usable but frustrating.
- We could write a new command like mwscript-k8s-list. As above, I don't think we should do this, but I could maybe live with it for this specific case. For transparency, I'd want it to print the underlying kubectl command before invoking it.
- We could document the long, unwieldy commands on Wikitech for easy copy-pasting. That at least saves people from having to keep them in their own notes (or their own shell histories, or their own aliases) while still making them easy(ish) to modify as needed.
- We could write a web frontend for checking on running jobs. Despite the above, I would actually really like this; even though operators would still need to drop down to the CLI to do much more than the basics, the UI could be a lot more usable, not just a little, and that would make it worth doing despite the limitations. But it would also be a bigger engineering project, and I don't think ServiceOps new has the resources for it right now.
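
For the mwscript-k8s-list option, the "print the underlying kubectl command before invoking it" behaviour could look roughly like the sketch below. The command does not exist; the function names, column headers, and JSONPath expressions are all assumptions (in particular, that the labels are named username and script and that the args sit on the first container).

```python
import shlex
import subprocess

# Assumed column layout; the paths would need checking against the real jobs.
COLUMNS = [
    ("NAME", ".metadata.name"),
    ("USER", ".metadata.labels.username"),
    ("SCRIPT", ".metadata.labels.script"),
    ("ARGS", ".spec.template.spec.containers[0].args"),
    ("SUCCEEDED", ".status.succeeded"),
]


def build_list_command(namespace=None):
    """Build the kubectl argv for listing jobs with custom columns."""
    spec = ",".join(f"{header}:{path}" for header, path in COLUMNS)
    cmd = ["kubectl", "get", "jobs", "-o", f"custom-columns={spec}"]
    if namespace:
        cmd += ["-n", namespace]
    return cmd


def list_jobs(namespace=None):
    """Print the underlying command for transparency, then run it."""
    cmd = build_list_command(namespace)
    # shlex.join quotes the argument so the printed line can be pasted into a shell.
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

Printing the exact command keeps operators one copy-paste away from modifying it themselves, which is the main thing a wrapper would otherwise take away.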