Get phawikibugs to run on tools.wmflabs
Closed, Resolved (Public)

Description

PhabricatorBot officially needs to be started as a daemon under phd (phabricator daemon), but this is not possible under SGE.

This doesn't work:

jsub -mem 1G ./phabricator/bin/phd debug PhabricatorBot  `pwd`/phab_config.json

but causes large amounts of debug output. Other methods of starting the bot cause forks, forks everywhere, and we end up with random processes running on random hosts while the SGE job no longer exists.

Another option would be to set up a host/project for this alone, so we can use phab's process/daemon management.

Event Timeline

valhallasw set the priority of this task to Lowest.
valhallasw updated the task description.
valhallasw changed Security from none to None.

Okay, scratch that. phd *also* happily forks the bot and exits, even in debug mode. phawikibugs is still on IRC, but qstat does not show any phd process.

With a bit of bash magic:

for i in {01..14}; do echo $i `ssh tools-exec-$i "ps aux | grep phab | grep -v bash | grep -v grep"`; done

we find it's running on tools-exec-10:

51894 8350 1.2 0.5 380700 47096 ? Ss Oct07 8:50 php ./exec_daemon.php PhabricatorBot --trace --load-phutil-library=/data/project/wikibugs/phab_is_retarded/arcanist/src --load-phutil-library=/data/project/wikibugs/phab_is_retarded/phabricator/src --verbose -- /data/project/wikibugs/phab_is_retarded/phab_config.json

which is not one of the hosts that went down last night (due to the virt1005 crash), so that's not the origin of the issue.

Qgil raised the priority of this task from Lowest to Medium.
Qgil moved this task from Backlog to Need discussion on the Bugzilla-Migration board.

Answer from Phabricator:

14:09 <@epriestley> valhallasw`cloud: foregrounding is not supported

In the meantime, I'm going to see if strace-ing

./libphutil/scripts/daemon/exec/exec_daemon.php PhabricatorBot --load-phutil-library=arcanist/src --load-phutil-library=phabricator/src $@ -- ./phab_config.json

will be of any use. Unfortunately, SGE is down at the moment, so I can't test.

This was the cause: https://github.com/phacility/libphutil/blob/master/src/daemon/PhutilDaemon.php#L47

if (!posix_isatty(STDOUT)) {
  posix_kill(posix_getppid(), SIGUSR1);
}

Apparently SIGUSR1 is used to tell the parent process that the child is still alive... but the default action for SIGUSR1 is to terminate the receiving process.
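
The effect can be reproduced without Phabricator. The following is a minimal sketch, assuming bash on a POSIX system; the child shell is only a stand-in for the process that receives the signal and is not part of the original setup. It sends SIGUSR1 to a shell with no handler installed and inspects the exit status.

```shell
#!/bin/bash
# Stand-in for the signalled parent: a shell with no USR1 handler.
# It signals itself, so the default disposition (terminate) applies
# and the echo after the kill is never reached.
bash -c 'kill -USR1 $$; echo "still alive"'
status=$?

# An exit status above 128 means "terminated by a signal";
# kill -l maps that status back to the signal name.
sig_name=$(kill -l "$status")
echo "child exit status: $status (signal: $sig_name)"
```

The signal number behind USR1 differs between platforms, which is why the sketch maps the status back to a name instead of comparing against a fixed number.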

Fixing this with

function on_USR1()
{
  echo Received SIGUSR1
}

trap on_USR1 USR1

means the process doesn't disappear anymore, but the bot is also not killed with qdel. Why did the Phabricator team decide to re-invent the wheel...?
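
A quick way to check that the trap is what keeps the process alive is sketched here, assuming bash; the child shell is a stand-in for the wrapper, not the real job script. Keep in mind that a trap only helps for catchable signals: SIGKILL cannot be trapped, so a wrapper that is killed that way gets no chance to stop anything it has started.

```shell
#!/bin/bash
# Stand-in for the wrapper: install a USR1 handler, then receive USR1.
# The handler runs and the shell carries on with the next command.
output=$(bash -c '
  on_USR1() { echo "Received SIGUSR1"; }
  trap on_USR1 USR1
  kill -USR1 $$
  echo "still alive"
')
status=$?
echo "$output"
echo "child exit status: $status"
```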

A few other options I've tried:

  • script. Doesn't work at all.
  • script-with-screen. Same.
  • using phd with an on_EXIT handler:

function on_EXIT()
{
  echo Stopping all daemons...
  phabricator/bin/phd stop
  echo Done.
}

function on_USR1()
{
  echo Received SIGUSR1
}

trap on_USR1 USR1
trap on_EXIT EXIT

echo Starting daemon via PHD!
phabricator/bin/phd launch phabricatorbot `pwd`/phab_config.json
echo Started. Sleep for ALL times!

Starting works, killing doesn't.

And thanks to @coren I now have phawikibugs running under SGE in a sensible way. The SGE config was changed to send SIGINT instead of SIGKILL, which makes the wrapper script

function on_EXIT()
{
  echo Stopping all daemons...
  phabricator/bin/phd stop
  echo Done.
}

function on_USR1()
{
  echo Received SIGUSR1
}

trap on_USR1 USR1
trap on_EXIT EXIT

echo Starting daemon via PHD!
phabricator/bin/phd launch phabricatorbot `pwd`/phab_config.json
echo Started. Sleep for ALL times!

work like a charm.
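
Why the switch from SIGKILL to SIGINT matters can be sketched as follows. This is an illustration under assumptions, not the actual job script: the echo in the EXIT trap stands in for `phabricator/bin/phd stop`, and the child sends the signal to itself instead of receiving it from SGE. It relies on bash running the EXIT trap when it is terminated by a catchable signal. The wait loop that presumably follows the final echo of the wrapper is not shown in the thread and is not reconstructed here.

```shell
#!/bin/bash
# Stand-in wrapper: the EXIT trap plays the role of "phd stop".
# $1 is the signal name the stand-in sends to itself.
wrapper='trap "echo Stopping all daemons..." EXIT; kill -s "$1" $$'

tmp=$(mktemp)

# SIGINT is catchable: bash runs the EXIT trap before terminating.
bash -c "$wrapper" wrapper INT > "$tmp"
out_int=$(cat "$tmp")

# SIGKILL is not catchable: the shell dies at once, no trap output.
bash -c "$wrapper" wrapper KILL > "$tmp"
status_kill=$?
out_kill=$(cat "$tmp")

rm -f "$tmp"
echo "after SIGINT:  '$out_int'"
echo "after SIGKILL: '$out_kill' (exit status $status_kill)"
```

With SIGKILL the cleanup never runs, which matches the earlier observation that starting worked but killing did not.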