metricsinfra-puppetserver-1.metricsinfra.eqiad1.wikimedia.cloud is failing to update its puppet repo because /srv is out of inodes. The immediate problem is pretty clear:
root@metricsinfra-puppetserver-1:~# ls /srv/puppet_code/environments_staging/
oot_branch_202405020945  oot_branch_202405021037  oot_branch_202405021119  oot_branch_202405021252  oot_branch_202405021426  oot_branch_202405021548  oot_branch_202405021701
oot_branch_202405020955  oot_branch_202405021047  oot_branch_202405021130  oot_branch_202405021303  oot_branch_202405021446  oot_branch_202405021630  oot_branch_202405021712
oot_branch_202405021016  oot_branch_202405021058  oot_branch_202405021150  oot_branch_202405021314  oot_branch_202405021507  oot_branch_202405021640  oot_branch_202405021743
oot_branch_202405021026  oot_branch_202405021108  oot_branch_202405021221  oot_branch_202405021355  oot_branch_202405021528  oot_branch_202405021651  production
This is probably an interaction between git-sync-upstream.py (which creates those oot branches for rebasing purposes) and puppetserver-deploy-code.sh (which invokes g10k, which copies everything into /srv/puppet_code). I suspect it's one or more of the following:
- g10k copies everything over blindly, meaning that if multiple code branches are active in any deployed directory, it piles them up in /srv/puppet_code and never cleans up.
- git-sync-upstream creates temporary branches as part of the merge process, and while managing those branches it likely triggers git hooks that in turn invoke puppetserver-deploy-code.
- puppetserver-deploy-code uses absolute paths and is invoked by git hooks, so git actions anywhere on the system result in a deployment from /srv/git/operations/puppet.
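To help confirm which of these is at play, a rough diagnostic sketch in Python follows. It is not part of git-sync-upstream.py or puppetserver-deploy-code.sh. The function names and the `oot_branch_<12 digits>` name pattern are assumptions based on the listing above. It reports inode usage for a filesystem and approximates how many inodes each leftover oot_branch_* directory holds (hard links are counted once per name, so the figure is an upper bound).

```python
import os
import re
from pathlib import Path

# Assumed naming scheme, inferred from the directory listing: oot_branch_YYYYMMDDHHMM
OOT_PATTERN = re.compile(r"^oot_branch_\d{12}$")


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem containing path."""
    st = os.statvfs(path)
    return st.f_files - st.f_ffree, st.f_files


def count_entries(root):
    """Approximate the inodes under root: one per directory entry, plus root itself."""
    total = 1
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def stale_environments(staging_dir):
    """Map each oot_branch_* directory in staging_dir to its approximate entry count."""
    result = {}
    for entry in sorted(Path(staging_dir).iterdir()):
        if entry.is_dir() and not entry.is_symlink() and OOT_PATTERN.match(entry.name):
            result[entry.name] = count_entries(entry)
    return result


if __name__ == "__main__":
    staging = "/srv/puppet_code/environments_staging"
    if os.path.isdir(staging):
        used, total = inode_usage(staging)
        print(f"inodes used: {used} of {total}")
        for name, entries in stale_environments(staging).items():
            print(f"{name}: ~{entries} entries")
```

If every oot_branch_* directory is roughly the size of production, that would point at the deploy step copying a full environment per temporary branch rather than at something else consuming inodes on /srv.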