qsub by default sends jobs with `-l release=precise`, which cannot be satisfied
Closed, Resolved (Public)

Description

I'm trying to run a script via the job grid, but it doesn't run. Checking the job queue, I see it's stuck in the queued/waiting state.
The job details show that it can't run with the default settings.
This is my crontab entry:

0 12,0 * * * qsub -m n $HOME/cleaner.sh

And these are the details of the job:

tools.ytcleaner@tools-bastion-03:~$ qstat -j 6307824
==============================================================
job_number:                 6307824
exec_file:                  job_scripts/6307824
submission_time:            Fri May 25 00:01:16 2018
owner:                      tools.ytcleaner
uid:                        52856
group:                      tools.ytcleaner
gid:                        52856
sge_o_home:                 /data/project/ytcleaner
sge_o_log_name:             tools.ytcleaner
sge_o_path:                 /usr/bin:/bin
sge_o_shell:                /bin/sh
sge_o_workdir:              /mnt/nfs/labstore-secondary-tools-project/ytcleaner
sge_o_host:                 tools-cron-01
account:                    sge
stderr_path_list:           NONE:NONE:$HOME/log/cron-tools.ytcleaner-1.err
hard resource_list:         h_vmem=256M,release=precise
mail_list:                  mbch331.wikipedia@gmail.com
notify:                     FALSE
job_name:                   cron-tools.ytcleaner-1
stdout_path_list:           NONE:NONE:/dev/null
jobshare:                   0
env_list:
script_file:                /data/project/ytcleaner/cleaner.sh
scheduling info:            (-l h_vmem=256M,release=precise) cannot run in queue "webgrid-lighttpd@tools-webgrid-lighttpd-1424.tools.eqiad.wmflabs" because job requests unknown resource (release)
                            has no permission for cluster queue "giftbot"
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1405.eqiad.wmflabs" because it offers only hc:h_vmem=0.000
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1401.eqiad.wmflabs" because it offers only hc:h_vmem=0.000
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1437.tools.eqiad.wmflabs" because it offers only hc:h_vmem=220.000M
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-generic-1403.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-generic-1404.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-generic-1401.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-generic-1402.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1413.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1407.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1441.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1431.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1428.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1406.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1411.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1408.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1425.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1402.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1427.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1417.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1414.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1410.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1406.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1418.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1404.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1415.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1426.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1439.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1419.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1416.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1409.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1413.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1424.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1428.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1422.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1432.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1405.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1426.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1425.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1434.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1401.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1410.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1403.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1415.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1412.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1407.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1427.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1435.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1430.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1436.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1402.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1421.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1429.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1417.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1414.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1433.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1442.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1440.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1420.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1404.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1416.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1418.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1423.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1409.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1408.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-exec-1438.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1421.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1412.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1420.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1422.tools.eqiad.wmflabs" because it offers only hf:release=trusty
                            (-l h_vmem=256M,release=precise) cannot run at host "tools-webgrid-lighttpd-1403.eqiad.wmflabs" because it offers only hf:release=trusty

Running cleaner.sh manually from the command line works without issues. Do I need to change anything in my cron job, or is something else wrong?
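
The scheduling info above is long but mostly repeats the same few reasons. As a rough sketch (assuming a POSIX shell with awk and sort available; the function name `summarize_reasons` is made up for illustration), the distinct reasons can be counted like this:

```shell
# Hypothetical helper: reads `qstat -j <job id>` output on stdin and
# prints one line per distinct "because ..." reason, most frequent first.
summarize_reasons() {
    # Split each line on the literal text " because "; lines without it
    # (for example the "has no permission" line) are skipped.
    awk -F' because ' '
        NF > 1 { count[$2]++ }
        END    { for (reason in count) printf "%d\t%s\n", count[reason], reason }
    ' | sort -rn
}
```

It would be used as `qstat -j 6307824 | summarize_reasons`. For a job like the one above, the `hf:release=trusty` reason should dominate the list, which points at the `release=precise` request rather than at memory.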

Event Timeline

For some reason qsub defaults to release=precise (this should be fixed, but I don't know how). Use jsub instead, or specify -l release=trusty explicitly.
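
For illustration, the two workarounds might look like this in the crontab from the description. This is only a sketch: the jsub line assumes jsub accepts the script path the same way qsub does, and only one of the entries should be active at a time.

```
# Workaround 1: keep qsub, but request the trusty release explicitly.
0 12,0 * * * qsub -m n -l release=trusty $HOME/cleaner.sh

# Workaround 2 (assumed invocation): let jsub choose the defaults instead.
# 0 12,0 * * * jsub $HOME/cleaner.sh
```

Either way, `qstat -j <job id>` on the next run should no longer list `release=precise` in the hard resource_list.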

Mentioned in SAL (#wikimedia-cloud) [2018-05-25T05:31:20Z] <zhuyifei1999_> Edit /data/project/.system/gridengine/default/common/sge_request, h_vmem 256M -> 512M, release precise -> trusty T195558

Seems to be working.
6320389 0.30000 cron-tools tools.ytclea r 05/25/2018 05:51:12 webgrid-generic@tools-webgrid- 1
And I get an e-mail from my script, which means it has run.

zhuyifei1999 claimed this task.

Please use the task queue for one-off tasks instead of the webgrid-generic queue, which is meant for webservices.

zhuyifei1999 renamed this task from Cronjob stuck in jobgrid to qsub by default sends jobs with `-l release=precise`, which cannot be satisfied. May 25 2018, 6:01 AM
Vvjjkkii renamed this task from qsub by default sends jobs with `-l release=precise`, which cannot be satisfied to 5acaaaaaaa. Jul 1 2018, 1:08 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed zhuyifei1999 as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
Mbch331 renamed this task from 5acaaaaaaa to qsub by default sends jobs with `-l release=precise`, which cannot be satisfied. Jul 2 2018, 9:07 AM
Mbch331 closed this task as Resolved.
Mbch331 assigned this task to zhuyifei1999.
Mbch331 raised the priority of this task from High to Needs Triage.
Mbch331 updated the task description.
Mbch331 added a subscriber: Aklapper.