Garbage collect orphan jobs workspaces on slaves
Closed, Resolved (Public)

Description

From https://www.mediawiki.org/wiki/Continuous_integration/meetings/2015-03-30/minutes

@Legoktm and @Krinkle unified a lot of jobs (e.g. npm). The old jobs were deleted on the master, but their workspaces remain on the slaves.

Since MediaWiki extension jobs each have a full copy of mediawiki/core (we will get rid of that with T93703), we should get Jenkins to remove the old workspaces on slaves when a job is deleted.

There is apparently no support for it, so we will have to write some kind of garbage collector.


Event Timeline

hashar raised the priority of this task from to Needs Triage.
hashar updated the task description.
hashar added subscribers: hashar, Krinkle, Legoktm.
hashar set Security to None.

I already wrote a script to do this and ran it on the production slaves, but haven't on the labs ones. Shouldn't be too hard to do.

Awesome @Legoktm! Can you send the script in a git repo? integration/config.git /tools/ comes to mind.

Beware that a workspace directory name can carry an '@n' suffix (n >= 2), which Jenkins adds to avoid workspace conflicts when the same job runs concurrently on the same instance.
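A minimal sketch of the name normalization this implies, assuming everything from the first '@' in the directory name onward was added by Jenkins. The helper name job_name_for_workspace and the sample directory names are made up for illustration.

```python
def job_name_for_workspace(dirname):
    """Return the job name for a workspace directory name.

    Assumption: everything from the first '@' onward is a marker
    added by Jenkins (for example '@2'), so 'npm@2' maps to 'npm'.
    """
    return dirname.partition('@')[0]


# Hypothetical directory names, for illustration only.
for name in ['npm', 'npm@2', 'some-extension-job@3']:
    print(name, '->', job_name_for_workspace(name))
```

A directory name without any '@' is returned unchanged, so the same lookup works for both the primary workspace and its concurrent copies.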

#!/usr/bin/env python2
from __future__ import unicode_literals, print_function

# CC-0

import os
import shutil
import sys

# Dry run by default: only delete when --delete is passed.
delete = '--delete' in sys.argv

# jobs.txt holds one existing job name per line.
with open('jobs.txt') as f:
    jobs = set(f.read().splitlines())

path = '/mnt/jenkins-workspace/workspace'
#path = '/srv/ssd/jenkins-slave/workspace'
for workspace in os.listdir(path):
    # Strip any '@n' suffix to recover the job name.
    if workspace.split('@', 1)[0] not in jobs:
        full_path = os.path.join(path, workspace)
        print(full_path)
        if delete:
            shutil.rmtree(full_path)

jobs.txt is a list of all the jobs that currently exist in jjb, generated by:

    rm -rf output && jenkins-jobs test config -o output && ls output > jobs.txt
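For anyone who would rather build the list from Python, here is a rough equivalent of the final `ls output > jobs.txt` step. It assumes, as the command above implies, that every entry in the output directory is named after a job. The function name write_job_list is hypothetical and not part of jjb.

```python
import os


def write_job_list(output_dir, dest):
    """Write one job name per line, mirroring `ls output > jobs.txt`.

    Assumption: each entry in output_dir is named after a job.
    Hidden entries are skipped, as plain `ls` would skip them.
    """
    names = sorted(n for n in os.listdir(output_dir) if not n.startswith('.'))
    with open(dest, 'w') as f:
        for name in names:
            f.write(name + '\n')
    return names
```

Calling write_job_list('output', 'jobs.txt') after the jenkins-jobs test run should leave a jobs.txt that the cleanup script can read as is.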

I ran it on all the labs slaves yesterday.

Krinkle moved this task from Untriaged to Backlog on the Continuous-Integration-Infrastructure board.

That is less of a concern now that most jobs are running on disposable instances. As such there is less need to garbage collect workspaces.