The Configuration Matrix has two main problems:
- It incurs a lot of infrastructure overhead because it spawns a new job for each database update.
- The "Configuration Matrix" plugin for Jenkins is not stable and causes deadlocks on a daily basis: it waits for executors on the deployment-bastion.eqiad node, which clearly has sufficient executors available. See T72597. This particular deadlock has only ever been observed on the deployment-bastion.eqiad node, from a beta-update-databases-eqiad job.
Please rewrite this Jenkins job to instead spawn parallel child processes within a single job, e.g. using bash, Python, or otherwise. The current job clearly doesn't work properly, and it doesn't look like we're getting any closer with the various upstreams (Gearman, Jenkins) to figuring out what is causing this deadlock.
The only tricky bit is orchestrating the console output. A few ideas:
- Write each child's output to a separate file, collected as a build artefact.
- Or: buffer each child's output and send it to stdout one child at a time.
- Or: send everything to stdout, with a unique prefix per child.
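
As a rough illustration, the second and third ideas can be combined: buffer each child's output, then print it as one block with a per-child prefix once that child finishes. The following is only a minimal Python sketch under assumptions. The names `run_one` and `run_all`, the `max_workers` default, and the shape of `commands` (a label such as a wiki database name mapped to an argv list) are all hypothetical, and the actual per-wiki update command is not shown here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(label, command):
    """Run one child process and return (label, exit code, captured output)."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same buffer
        text=True,
    )
    return label, proc.returncode, proc.stdout


def run_all(commands, max_workers=4, out=sys.stdout):
    """Run the commands in parallel; print each child's output as one block.

    `commands` maps a hypothetical label (e.g. a wiki database name) to an
    argv list. Returns the number of children that exited non-zero.
    """
    failures = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_one, label, command)
            for label, command in commands.items()
        ]
        # Only this thread writes to `out`, so blocks never interleave.
        for future in as_completed(futures):
            label, code, output = future.result()
            for line in output.splitlines():
                out.write(f"[{label}] {line}\n")
            out.write(f"[{label}] exit code {code}\n")
            if code != 0:
                failures += 1
    return failures
```

A job script would call `run_all(...)` with one entry per database and exit non-zero if it returns anything other than 0, so the Jenkins build still fails when any single update fails. The trade-off of buffering is that nothing from a child appears in the console until that child has finished.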