
tools: Grid spawned multiple instances of a "-once -continuous" type job
Closed, Resolved, Public

Description

On both tools.ecmabot and tools.wmfdbbot, multiple copies of the main application process were running at the same time, which caused weird issues on IRC while I was on holiday.

ecmabot

$ qstat
3224838 0.31298 ecmabot-wm tools.ecmabo r 08/17/2014 08:38:25 continuous@tools-exec-13.eqiad 1
3246425 0.31064 ecmabot-wm tools.ecmabo r 08/18/2014 03:48:25 continuous@tools-exec-07.eqiad 1
3247911 0.31048 ecmabot-wm tools.ecmabo r 08/18/2014 05:06:25 continuous@tools-exec-11.eqiad 1
3313822 0.30323 ecmabot-wm tools.ecmabo r 08/20/2014 16:16:18 continuous@tools-exec-13.eqiad 1
3313991 0.30320 ecmabot-wm tools.ecmabo r 08/20/2014 16:26:18 continuous@tools-exec-06.eqiad 1
3338423 0.30054 ecmabot-wm tools.ecmabo r 08/21/2014 14:10:18 continuous@tools-exec-07.eqiad 1
$ cat crontab.txt
*/2 * * * * /usr/bin/jsub -N ecmabot-wm -once -continuous -quiet -stderr -mem 1700M node ~/apps/oftn-bot/wm-ecmabot.js > /dev/null 2>&1

wmfdbbot

$ qstat
1976888 0.45298 lighttpd-w tools.wmfdbb r 06/30/2014 18:20:02 webgrid-lighttpd@tools-webgrid 1
1976925 0.45298 dbbot-wm tools.wmfdbb r 06/30/2014 18:20:17 continuous@tools-exec-06.eqiad 1
3239087 0.31142 dbbot-wm tools.wmfdbb r 08/17/2014 21:25:25 continuous@tools-exec-07.eqiad 1
3277316 0.30723 dbbot-wm tools.wmfdbb r 08/19/2014 07:35:25 continuous@tools-exec-07.eqiad 1
3325477 0.30197 dbbot-wm tools.wmfdbb r 08/21/2014 02:30:33 continuous@tools-exec-04.eqiad 1
$ cat crontab.txt
*/5 * * * * /usr/bin/jsub -N dbbot-wm -once -continuous -quiet -mem 1280M -o /dev/null php ~/apps/ts-krinkle-Kribo/Init.php

I've qdel'ed both job names and let cron start a fresh instance of each. Reporting here so that someone can look into it, since this should not have happened with -once.
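
For anyone who wants to check whether their own tool is affected, here is a minimal sketch that flags job names with more than one running instance. It assumes the qstat column layout shown above (job ID, priority, name, user, state, ...); `find_duplicate_jobs` is a hypothetical helper name, not part of jsub or the grid tooling. Note that qstat truncates job names in this listing (e.g. "lighttpd-w"), so two jobs whose names share the same prefix would be counted together.

```shell
# Hypothetical helper: read qstat-style rows on stdin and print each job name
# that has more than one row in state "r", followed by the instance count.
# Assumes column 3 is the job name and column 5 is the state, as in the
# listings above. Header and separator lines are skipped because their
# fifth field is not "r".
find_duplicate_jobs() {
    awk '$5 == "r" { count[$3]++ }
         END { for (name in count) if (count[name] > 1) print name, count[name] }' | sort
}
```

Usage would be `qstat | find_duplicate_jobs`; an empty result means no job name currently has more than one running instance.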


Version: unspecified
Severity: normal

Details

Reference
bz69867

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:33 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz69867.
*** This bug has been marked as a duplicate of bug 60862 ***