The toolsbeta grid is down. We need to fix it so that it can serve its purpose as a staging/development environment for tools proper.
toolsbeta.automated-toolforge-tests@toolsbeta-sgebastion-05:~$ qstat
error: commlib error: got select error (Connection refused)
error: unable to send message to qmaster using port 6444 on host "toolsbeta-sgegrid-shadow.toolsbeta.eqiad1.wikimedia.cloud": got send error
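The `qstat` error above says the connection to the qmaster port was refused. A minimal sketch for checking that independently of the grid client tools, assuming it is run from a host that can reach the grid network (the helper name `port_is_open` is hypothetical; the host and port are copied from the error message):

```python
import socket

# Values taken from the qstat error message above.
QMASTER_HOST = "toolsbeta-sgegrid-shadow.toolsbeta.eqiad1.wikimedia.cloud"
QMASTER_PORT = 6444


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration only. It reports reachability of
    the port and nothing about the health of the service behind it.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `port_is_open(QMASTER_HOST, QMASTER_PORT)` and getting `False` would be consistent with nothing listening on the qmaster port, but it does not by itself explain why.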
arturo@nostromo:~$ cookbook wmcs.toolforge.grid.get_cluster_status --project toolsbeta
START - Cookbook wmcs.toolforge.grid.get_cluster_status
PASS |                                                                                                               |   0% (0/1) [00:07<?, ?hosts/s]
FAIL |███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100% (1/1) [00:07<00:00, 7.19s/hosts]
Exception raised while executing cookbook wmcs.toolforge.grid.get_cluster_status:
Traceback (most recent call last):
  File "/home/arturo/git/wmf/operations/software/spicerack/spicerack/_menu.py", line 234, in run
    raw_ret = runner.run()
  File "/home/arturo/git/wmf/operations/cookbooks/cookbooks/wmcs/toolforge/grid/get_cluster_status.py", line 99, in run
    nodes_info = self.grid_controller.get_nodes_info()
  File "/home/arturo/git/wmf/operations/cookbooks/cookbooks/wmcs/libs/grid.py", line 314, in get_nodes_info
    xml_output = run_one_raw(node=self._master_node, command=["qhost", "-q", "-xml"], print_output=False)
  File "/home/arturo/git/wmf/operations/cookbooks/cookbooks/wmcs/libs/common.py", line 423, in run_one_raw
    result = next(node.run_sync(command, **kwargs))
  File "/home/arturo/git/wmf/operations/software/spicerack/spicerack/remote.py", line 520, in run_sync
    return self._execute(
  File "/home/arturo/git/wmf/operations/software/spicerack/spicerack/remote.py", line 720, in _execute
    raise RemoteExecutionError(ret, "Cumin execution failed")
spicerack.remote.RemoteExecutionError: Cumin execution failed (exit_code=2)
END (FAIL) - Cookbook wmcs.toolforge.grid.get_cluster_status (exit_code=99)
arturo@nostromo:~$ cookbook wmcs.toolforge.tests --bastion-hostname toolsbeta-sgebastion-05 --project toolsbeta
START - Cookbook wmcs.toolforge.tests
----- OUTPUT of 'sudo -i cmd-chec...forge-tests.yaml' -----
[2022-09-28 09:12:34] INFO: --- toolsbeta-sgebastion-05 Debian GNU/Linux 10 (buster) 4.19.0-19-cloud-amd64
[2022-09-28 09:12:34] INFO: ---
[...]
[2022-09-28 09:20:01] INFO: ---
[2022-09-28 09:20:01] INFO: --- passed tests: 9
[2022-09-28 09:20:01] INFO: --- failed tests: 11
[2022-09-28 09:20:01] INFO: --- total tests: 20