Prune /srv/mediawiki/php-1.27.0-wmf.19
Closed, Resolved, Public

Description

During a full scap, I got the following messages:

23:33:25 Started rsync common
cannot delete non-empty directory: php-1.27.0-wmf.19/cache/l10n
cannot delete non-empty directory: php-1.27.0-wmf.19/cache/l10n
cannot delete non-empty directory: php-1.27.0-wmf.19/cache
cannot delete non-empty directory: php-1.27.0-wmf.19/cache
cannot delete non-empty directory: php-1.27.0-wmf.19
23:34:31 Finished rsync common (duration: 01m 06s)

According to the train reference, the following command should take care of that:
SSH_AUTH_SOCK=/run/keyholder/proxy.sock dsh -F 20 -M -g mediawiki-installation -r ssh -o -oUser=mwdeploy -- rm -rf /srv/mediawiki/php-1.27.0-wmf.19

@thcipriani noted this issue didn't appear before, and so is worth investigating.

Event Timeline

Might be my fault. When doing the deployment train I am reluctant to delete files because I don't quite understand the impact, or how to roll back or recreate them in case of a mistake.

The problem is that the l10n directory is owned by l10nupdate, but the parent directory is owned by some deployer without group write. Cleaning it up requires sudo -u l10nupdate rm php-${version}/cache/l10n/* followed by rm -rf php-${version}/

On the remote side everything should be owned by mwdeploy.
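The ownership claims above could be checked before pruning with something like the sketch below. It is only an illustration: list_foreign_owned is a hypothetical helper name, not part of scap, and it assumes find is available on the host being checked.

```shell
# Hypothetical helper (not part of scap): print every path under a directory
# that is NOT owned by the expected user. The owner may be a user name or a
# numeric uid.
list_foreign_owned() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}

# Example invocation (assumes the paths and user from this task; not run here):
#   list_foreign_owned /srv/mediawiki/php-1.27.0-wmf.19 mwdeploy
```

Empty output would mean everything under the directory is owned by the given user, so a plain rm -rf as that user should not hit permission errors.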

This message comes from rsync. The php-${version}/cache/l10n directory is filled with cdb files that we exclude (https://github.com/wikimedia/scap/blob/master/scap/tasks.py#L37), so when we delete that directory locally, rsync refuses to delete the remote directory because it still contains the excluded files. Here is a micro example of what's happening:

(•◡•)❥ mkdir -p original/one
tmp
(•◡•)❥ echo '1' > !$/one.txt
echo '1' > original/one/one.txt
tmp
(•◡•)❥ rsync -avz --delete --exclude='**/*.txt' original/ extra-crispy
sending incremental file list
created directory extra-crispy
./
one/

sent 92 bytes  received 58 bytes  300.00 bytes/sec
total size is 0  speedup is 0.00
tmp
(•◡•)❥ echo '1' > extra-crispy/one/one.txt
tmp
(•◡•)❥ rm -rf original/one
tmp
(•◡•)❥ rsync -avz --delete --exclude='**/*.txt' original/ extra-crispy
sending incremental file list
cannot delete non-empty directory: one
./

sent 57 bytes  received 62 bytes  238.00 bytes/sec
total size is 0  speedup is 0.00

I think we just need to remove the remote files via dsh:

SSH_AUTH_SOCK=/run/keyholder/proxy.sock dsh -F 20 -M -g mediawiki-installation -r ssh -oUser=mwdeploy -- rm -rf /srv/mediawiki/php-1.27.0-wmf.19

I don't have the slightest idea how long that will take for all...

thcipriani@tin:~
❯ grep -v '^#' /etc/dsh/group/mediawiki-installation | grep -v '^$' | uniq | sort | wc -l
411

app servers. Would like to do it in a deployment window.
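A side note on the host count above: uniq only collapses adjacent duplicate lines, so running it before sort can overcount if the group file lists a host twice in non-adjacent positions. A minimal sketch using sort -u instead is below; count_hosts is a hypothetical helper name, and whether the group file actually contains duplicates is not known from this task.

```shell
# Hypothetical helper: count unique, non-comment, non-blank entries in a
# dsh group file. sort -u deduplicates regardless of line order.
count_hosts() {
    grep -v '^#' "$1" | grep -v '^$' | sort -u | wc -l | tr -d ' '
}

# Example invocation (path taken from the command above; not run here):
#   count_hosts /etc/dsh/group/mediawiki-installation
```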

thcipriani claimed this task.

Did this today during the train.

For future reference, it should be fine to run anytime; it took less than a minute:

SSH_AUTH_SOCK=/run/keyholder/proxy.sock dsh -F 20 -M -g scap-masters -r ssh -o -oUser=mwdeploy -- rm -rf /srv/mediawiki/php-1.27.0-wmf.19
SSH_AUTH_SOCK=/run/keyholder/proxy.sock dsh -F 20 -M -g scap-proxies -r ssh -o -oUser=mwdeploy -- rm -rf /srv/mediawiki/php-1.27.0-wmf.19
SSH_AUTH_SOCK=/run/keyholder/proxy.sock dsh -F 20 -M -g mediawiki-installation -r ssh -o -oUser=mwdeploy -- rm -rf /srv/mediawiki/php-1.27.0-wmf.19