ToolsGridQueueProblem - Grid queue is in state E
Closed, Resolved (Public)


From alertmanager (

alertname: ToolsGridQueueProblem
summary: Grid queue is in state E
9 hours ago
instance: tools-sgegrid-master
job: node
queue: webgrid-lighttpd
severity: warn
state: E
@receiver: cloud-admin-feed

Event Timeline

dcaro changed the task status from Open to In Progress. Nov 1 2022, 8:41 AM
dcaro triaged this task as High priority.
dcaro created this task.
dcaro moved this task from To refine to Doing on the User-dcaro board.

Mentioned in SAL (#wikimedia-cloud-feed) [2022-11-01T09:37:27Z] <wm-bot2> cleaned up grid queue errors on tools-sgegrid-master (T322110) - cookbook ran by dcaro@vulcanus

Checked the status with:

dcaro@vulcanus$ cookbook wmcs.toolforge.grid.get_cluster_status --only-failed --project tools

(and using

That showed only one error, caused by the epilog issue, so I just cleaned up the queues and everything is back to normal.
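
For reference, a minimal sketch of how queue instances in state E could be picked out of plain Grid Engine `qstat -f` output, independently of the cookbook. This is not what the cookbook does internally (that is not shown in this task). The function name `queues_in_error` is hypothetical, and the parsing assumes the usual six-column `qstat -f` layout, with the states column last and empty when the queue instance is healthy.

```python
def queues_in_error(qstat_output: str) -> list[str]:
    """Return the queue instances whose states column contains 'E'.

    Assumes `qstat -f` style rows of the form
    queuename qtype resv/used/tot. load_avg arch states,
    where the states field is missing on healthy queue instances.
    """
    errored = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Queue instance rows look like "queue@host ..."; this skips the
        # header, separator lines, and the job rows listed under a queue.
        if len(fields) < 6 or "@" not in fields[0]:
            continue
        if "E" in fields[5]:
            errored.append(fields[0])
    return errored
```

The input would come from running `qstat -f` on the grid master (for example through `subprocess.run`). On a stock Grid Engine install, `qstat -f -explain E` prints the reason for each error state and `qmod -cq '<queue>'` clears it. Whether the cookbook uses exactly those commands is an assumption.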

dcaro moved this task from Doing to Done on the User-dcaro board.