
Job 948624 stuck on Toolforge grid
Closed, Resolved, Public

Description

Job 948624 has been stuck on the Toolforge grid since 03/18/2019, and I cannot delete it with qdel.

Event Timeline

Incola created this task. Mar 20 2019, 11:33 AM
Restricted Application added a subscriber: Aklapper. Mar 20 2019, 11:33 AM
aborrero triaged this task as Normal priority.
aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.
aborrero closed this task as Resolved. Mar 20 2019, 11:54 AM

I forced deletion of the job:

tools.incolabot@tools-sgebastion-07:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
 948624 0.26387 bar        tools.incola dt    03/18/2019 16:00:21 task@tools-sgeexec-0942.tools.     1
root@tools-sgebastion-07:~# qdel -f 948624
root forced the deletion of job 948624
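
For anyone hitting the same situation, here is a minimal sketch of how to spot jobs that are stuck in a deletion state. It assumes the default qstat layout shown above (two header lines, then one job per line with the state in the fifth column); the helper name stuck_jobs is hypothetical.

```shell
# Print the IDs of jobs whose state contains "d" (deletion requested),
# reading qstat output on stdin.
# Assumption: default qstat layout, i.e. two header lines followed by
# one job per line, with the state in the fifth whitespace-separated column.
stuck_jobs() {
    awk 'NR > 2 && $5 ~ /d/ { print $1 }'
}
```

It would be used as `qstat | stuck_jobs`. Note that the forced deletion in the transcript above was run as root, so a tool account will probably still need an admin to run `qdel -f` on the IDs it reports.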

However, I couldn't find which processes belonged to your tool on tools-sgeexec-0942, so I couldn't investigate any further why that happened.

Also, did you start the tool just after I stopped the job?

tools.incolabot@tools-sgebastion-07:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
 985301 0.25000 bar        tools.incola r     03/20/2019 11:50:18 task@tools-sgeexec-0924.tools.     1

And BTW, this is in your service.manifest file:

tools.incolabot@tools-sgebastion-07:~$ cat service.manifest 
# This file is used by toollabs infrastructure.
# Please do not edit manually at this time.
backend: kubernetes
distribution: debian
version: 2
web: php5.6

It contains backend: kubernetes, which is odd if you are using the grid.
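
A quick way to check the declared backend without reading the whole file, as a sketch only: it assumes the simple "key: value" layout shown above, and manifest_backend is a hypothetical helper name.

```shell
# Print the value of the "backend" key from a manifest file.
# Assumption: the file uses plain "key: value" lines, as in the output above.
# Defaults to ./service.manifest when no path is given.
manifest_backend() {
    awk -F': *' '$1 == "backend" { print $2 }' "${1:-service.manifest}"
}
```

Run from the tool's home directory, `manifest_backend` should print the backend named in the manifest.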

Thanks! The tool runs periodically to update some pages on the Italian Wikipedia, so it might have started just after you stopped the job. I think the Kubernetes backend is for the web interface.