PCC jobs running on compiler1001.puppet-diffs.eqiad.wmflabs fail because disk is full
Closed, Resolved, Public

Description

Example:

https://integration.wikimedia.org/ci/view/operations/job/operations-puppet-catalog-compiler/28074/console

[...]
[ 2021-02-15T12:47:45 ] INFO: Refreshing the common repos from upstream if needed
[ 2021-02-15T12:47:45 ] INFO: Creating directories under /srv/jenkins-workspace/puppet-compiler
error: copy-fd: write returned: No space left on device
fatal: failed to copy file to '/srv/jenkins-workspace/puppet-compiler/28074/production/src/.git/objects/pack/pack-2cf2755860f0b94cd9facffbf93494821f189eb9.pack': No space left on device
[ 2021-02-15T12:47:46 ] CRITICAL: `git clone -q /var/lib/catalog-differ/production /srv/jenkins-workspace/puppet-compiler/28074/production/src` failed: Command '['git', 'clone', '-q', '/var/lib/catalog-differ/production', '/srv/jenkins-workspace/puppet-compiler/28074/production/src']' returned non-zero exit status 128.
Build step 'Execute shell' marked build as failure
Finished: FAILURE

Event Timeline

hashar claimed this task.
hashar added a subscriber: hashar.

Same happened on compiler1002 recently: T273599

I have dropped build directories from /srv/jenkins-workspace/puppet-compiler.

This is a known issue, normally caused by users running PCC with an empty Hosts: entry, which produces a PCC report of ~2.6GB. During normal operations a cron job deletes files older than 30 days; however, when a user (normally me) runs a lot of reports with an empty Hosts: entry, the disk fills up. The simple fix is to check the output of the following command:

du -hs /srv/jenkins-workspace/puppet-compiler/output/* | grep G

and delete the large directories; I normally delete all but the most recent one. I'll look at maybe creating a timer to delete big folders older than 1 week.
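For illustration, the manual cleanup described above could be scripted roughly as follows. This is a minimal sketch and not the job added in the related change: the function names, the `DRY_RUN` variable, and the thresholds are hypothetical, and it assumes a `find` that supports `-mindepth`/`-maxdepth` (GNU or BSD). Nothing is deleted unless `DRY_RUN=0` is set.

```shell
# Sketch only: find and optionally delete large, old PCC report directories.
# Function names, DRY_RUN and the thresholds are illustrative assumptions.

# Print directories directly under ROOT whose mtime is more than
# MIN_AGE_DAYS whole days old (find -mtime +N rounds ages down to days)
# and whose disk usage is at least MIN_KIB KiB.
# The ( ... ) body runs in a subshell, so the variables stay local.
list_large_old_reports() (
    root=$1 min_kib=$2 min_age_days=$3
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$min_age_days" |
        while IFS= read -r dir; do
            # du -sk prints "<size in KiB><TAB><path>"
            size_kib=$(du -sk "$dir" | cut -f1)
            if [ "$size_kib" -ge "$min_kib" ]; then
                printf '%s\n' "$dir"
            fi
        done
)

# Delete what list_large_old_reports finds, but only when DRY_RUN=0.
# With DRY_RUN unset or set to anything else, only print the candidates.
prune_large_old_reports() (
    list_large_old_reports "$@" |
        while IFS= read -r dir; do
            if [ "${DRY_RUN:-1}" = "0" ]; then
                rm -rf -- "$dir"
            else
                printf 'would delete %s\n' "$dir"
            fi
        done
)
```

Under these assumptions, `prune_large_old_reports /srv/jenkins-workspace/puppet-compiler/output 1048576 7` would list report directories of at least 1 GiB that are more than a week old, and setting `DRY_RUN=0` first would actually remove them. Directory names containing newlines are not handled.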

Change 664585 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] P:puppet_compiler: add job to deletd large pcc reports after 7 days

https://gerrit.wikimedia.org/r/664585

Change 664585 merged by Jbond:
[operations/puppet@production] P:puppet_compiler: add job to deletd large pcc reports after 7 days

https://gerrit.wikimedia.org/r/664585