
[Epic] Write basic process-control, something good enough to run all CRM jobs.
Closed, Resolved · Public


This is urgent, blocking work, so please help limit scope. This task is finished when the following features are verified working:

Job migration status:


  • Ops can package and deploy the tool.
  • stdout logfiles are written one file per job run. (T161155)
  • Devs can run jobs one-off.
  • Runs jobs according to a code-generated crontab.
  • Never drop logs even (especially!) if the process is killed unexpectedly. (T161571)
  • Failmail when a job exits with a non-zero return code (let not perfect be thine enemy).
  • Nobody can accidentally run the script as their own user.
  • Working workaround for specific chained jobs. (T161035)


  • Job configuration is synced to /var/lib/process-control along with localsettings, read-only.
  • Global configuration file is synced to /etc/process-control.yaml
  • Devs have sudo access to the scripts and can pass any CLI params.
  • /var/log/process-control is group jenkins with g+ws (group-writable, setgid) permissions.
  • cron-generate can somehow write to /etc/cron.d/process-control
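The code-generated crontab could work roughly like this sketch; the job-config field names (`schedule`, `disabled`) and the `run-job` command are assumptions for illustration, not the actual process-control schema:

```python
def generate_crontab(jobs, user="jenkins"):
    """Render the body of /etc/cron.d/process-control from job config.

    `jobs` would be loaded from the YAML files synced to
    /var/lib/process-control; here it is a plain dict keyed by job name.
    """
    lines = ["# Generated by cron-generate; do not edit by hand."]
    for name, job in sorted(jobs.items()):
        if not job.get("disabled", False):
            lines.append("{schedule}\t{user}\trun-job {name}".format(
                schedule=job["schedule"], user=user, name=name))
    return "\n".join(lines) + "\n"
```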

Not in MVP scope

  • Devs can kill jobs.
  • Log actions and errors to syslog. Echo to console when os.isatty()
  • script to list all jobs and statuses (T161584)
  • should be able to disable groups of jobs (T160699)
  • repeated failure handling (T161567)
  • Turn process-control lock module into a context manager (T161536)
  • Clean up deb packaging once we're on Jessie.
  • 100% test coverage coziness.

Event Timeline

Restricted Application added a subscriber: Aklapper.
awight updated the task description.

@Jgreen @cwdent
I noticed that we're only provisioning to the new CRM server and not the old one. Wasn't the plan to migrate jobs on the old server first, to minimize changes as we go? Sorry if we already discussed this and I was talked out of it.

We can abandon that approach at any point, if it looks like backports stuff will cause extra work...

Job files are being provisioned as group www-data, mode 640, but I don't see why the webservers should be able to read these. We could use the jenkins service group just as well.

I'm not sure why the job files would be 640 yet the /etc config would be 555... looks like a default Puppet mode?

awight triaged this task as High priority.Mar 28 2017, 5:15 AM

We packaged it for Precise and puppetflung it to barium last week; is something absent?

When I looked last night, there was no /srv/p-c...

Ah, you're right; I just hadn't rsyncblastered it. It's there now.

awight renamed this task from [Epic] Basic process-control good enough to run all CRM jobs to [Epic] Write basic process-control, something good enough to run all CRM jobs..Apr 4 2017, 12:31 AM

I checked off "cron-generate can somehow write to /etc/cron.d/process-control": we have a sudo wrapper /usr/local/bin/cron-generate puppetized, which uses sudo to run /usr/bin/cron-generate as root. Also, rsync_blaster will optionally trigger this from the deployment host after syncing changes.
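For reference, such a wrapper can be as small as the following sketch; the sudoers rule is an assumption about how the puppetized version might grant access, not the actual deployed config:

```shell
#!/bin/sh
# /usr/local/bin/cron-generate -- thin wrapper so deployers can
# regenerate /etc/cron.d/process-control without a root shell.
exec sudo /usr/bin/cron-generate "$@"

# Corresponding sudoers rule (assumed; group name may differ):
#   %jenkins ALL=(root) NOPASSWD: /usr/bin/cron-generate
```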

Great, thanks! I've verified that it works.

awight claimed this task.
awight updated the task description.

Marking this task as done. Next step is to convert and test all the jobs.