`rsync error` during sync-pull-masters
Closed, Resolved, Public

Description

Scap complained today during train:

[zfilipin@deploy1001 ~]$ ./release/bin/deploy-promote
...
sync-pull-masters: 100% (ok: 1; fail: 0; left: 0)
13:14:08 Finished sync-pull-masters (duration: 00m 07s)
13:14:08 Started sync-check-canaries
13:14:19 ['/usr/bin/scap', 'pull', '--no-update-l10n', 'deploy2001.codfw.wmnet', 'deploy1001.eqiad.wmnet', 'deploy1001.eqiad.wmnet'] on mwdebug1001.eqiad.wmnet returned [70]: 13:14:09 Copying from deploy1001.eqiad.wmnet to mwdebug1001.eqiad.wmnet
13:14:09 Started rsync common
cannot delete non-empty directory: php-1.33.0-wmf.23/cache/l10n
cannot delete non-empty directory: php-1.33.0-wmf.23/cache/l10n
cannot delete non-empty directory: php-1.33.0-wmf.23/cache
cannot delete non-empty directory: php-1.33.0-wmf.23/cache
cannot delete non-empty directory: php-1.33.0-wmf.23
cannot delete non-empty directory: docroot/wwwportal/.well-known/matrix
cannot delete non-empty directory: docroot/wwwportal/.well-known
rsync: delete_file: unlink(docroot/wwwportal/.well-known/matrix/client) failed: Permission denied (13)
rsync: delete_file: rmdir(docroot/wwwportal/.well-known/matrix) failed: Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1668) [generator=3.1.2]
13:14:19 Finished rsync common (duration: 00m 10s)
13:14:19 Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scap/cli.py", line 342, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python2.7/dist-packages/scap/main.py", line 715, in main
    rsync_args=rsync_args
  File "/usr/lib/python2.7/dist-packages/scap/utils.py", line 402, in context_wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/scap/tasks.py", line 401, in sync_common
    subprocess.check_call(rsync)
  File "/usr/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['sudo', '-u', 'mwdeploy', '-n', '--', '/usr/bin/rsync', '--archive', '--delete-delay', '--delay-updates', '--compress', '--delete', '--exclude=**/cache/l10n/*.cdb', '--exclude=*.swp', '--no-perms', '--exclude=**/.git', 'deploy1001.eqiad.wmnet::common', '/srv/mediawiki']' returned non-zero exit status 23
13:14:19 pull failed: <CalledProcessError> Command '['sudo', '-u', 'mwdeploy', '-n', '--', '/usr/bin/rsync', '--archive', '--delete-delay', '--delay-updates', '--compress', '--delete', '--exclude=**/cache/l10n/*.cdb', '--exclude=*.swp', '--no-perms', '--exclude=**/.git', 'deploy1001.eqiad.wmnet::common', '/srv/mediawiki']' returned non-zero exit status 23

check-canaries: 100% (ok: 10; fail: 1; left: 0)
13:15:04 1 canaries had sync errors
13:15:04 Finished Canaries Synced (duration: 00m 55s)
...

Scap finished; there were no other problems.
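
To see at a glance which paths rsync could not delete, the `rsync: delete_file: ...` lines in the output above can be pulled out with a small parser. This is only a sketch: the regular expression is inferred from the two lines in this log, not from any documented rsync output format, and `delete_failures` is a hypothetical helper, not part of scap.

```python
import re

# Matches lines such as (format inferred from the log in this task):
#   rsync: delete_file: unlink(some/path) failed: Permission denied (13)
DELETE_FAILED = re.compile(
    r"^rsync: delete_file: (?P<op>\w+)\((?P<path>.+)\) failed: "
    r"(?P<reason>.+) \((?P<errno>\d+)\)$"
)


def delete_failures(output):
    """Return (op, path, reason, errno) for each failed rsync delete in output."""
    failures = []
    for line in output.splitlines():
        match = DELETE_FAILED.match(line.strip())
        if match:
            failures.append(
                (match["op"], match["path"], match["reason"], int(match["errno"]))
            )
    return failures
```

Run against the log above, this would report the `unlink` of `docroot/wwwportal/.well-known/matrix/client` and the `rmdir` of `docroot/wwwportal/.well-known/matrix`, both with errno 13 (Permission denied).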

Event Timeline

zeljkofilipin renamed this task from "Scap" to "`rsync error` during sync-pull-masters" (Jun 5 2019, 1:37 PM).
zeljkofilipin triaged this task as High priority.
zeljkofilipin updated the task description.
zeljkofilipin updated the task description.
zeljkofilipin raised the priority of this task from High to Unbreak Now! (Jun 5 2019, 1:43 PM).
zeljkofilipin added a subscriber: thcipriani.

@thcipriani do you know what went wrong? Is this blocking the train? If not, please decrease priority and remove from train blockers.

reedy@mwdebug1001:/srv/mediawiki$ sudo -u mwdeploy rm -rf php-1.33.0-wmf.23/
reedy@mwdebug1001:/srv/mediawiki$

I don't think so, as it's just failing to clean up/remove some dirs (which is an ongoing issue).
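
For context on why a cleanup failure surfaces as a hard error: the traceback shows scap calling `subprocess.check_call`, which raises on any non-zero exit status, and rsync exits with 23 ("Partial transfer due to error" in rsync(1)) when it cannot delete a path. A minimal sketch of a wrapper that treats 23 as a soft failure follows. It is not how scap behaves, and `run_rsync` and `tolerated` are hypothetical names; whether a partial transfer should be tolerated at all is a policy decision.

```python
import subprocess

RSYNC_PARTIAL_TRANSFER = 23  # "Partial transfer due to error" per rsync(1)


def run_rsync(cmd, tolerated=(RSYNC_PARTIAL_TRANSFER,)):
    """Run cmd; return its exit status if it is 0 or tolerated, else raise."""
    status = subprocess.call(cmd)
    if status != 0 and status not in tolerated:
        # Same exception type that check_call raises in the traceback above.
        raise subprocess.CalledProcessError(status, cmd)
    return status
```

A caller could log a warning when the returned status is 23 instead of aborting the pull, while still failing on any other rsync error.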

reedy@mwdebug1001:/srv/mediawiki$ ls -al docroot/wwwportal/
total 16
drwxr-xr-x  4 mwdeploy mwdeploy 4096 Mar 19  2018 .
drwxr-xr-x 10 mwdeploy mwdeploy 4096 Apr 15 15:43 ..
lrwxrwxrwx  1 mwdeploy mwdeploy   13 Mar 19  2018 portal -> ../../portals
lrwxrwxrwx  1 mwdeploy mwdeploy   21 Oct 18  2016 static -> /srv/mediawiki/static
drwxr-xr-x  2 mwdeploy mwdeploy 4096 Nov  7  2018 w
drwxr-xr-x  3 root     root     4096 Jun  4 18:28 .well-known
reedy@mwdebug1001:/srv/mediawiki$

Dunno why that was added. `.well-known` is owned by root, so mwdeploy can't delete it; will need a root to remove it.
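
The `ls` output above found the offending directory by hand. A rough sketch of the same check done recursively is below: it lists every entry under a tree that is not owned by the expected user, since rsync running as that user may be unable to delete those. `foreign_owned` is a hypothetical helper, not something scap provides.

```python
import os


def foreign_owned(root, expected_uid):
    """Return sorted paths under root whose owner is not expected_uid."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects symlinks themselves rather than their targets.
            if os.lstat(path).st_uid != expected_uid:
                found.append(path)
    return sorted(found)
```

On a host like mwdebug1001 this could be called as `foreign_owned("/srv/mediawiki", pwd.getpwnam("mwdeploy").pw_uid)` (with `import pwd`), assuming the `mwdeploy` account exists there as the listing suggests.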

See T223835: Configure wikimedia.org to enable *:wikimedia.org Matrix user IDs for why it exists...

Reedy claimed this task.
reedy@mwdebug1001:/srv/mediawiki$ scap pull
13:53:40 Copying from deploy1001.eqiad.wmnet to mwdebug1001.eqiad.wmnet
13:53:40 Started rsync common
13:53:45 Finished rsync common (duration: 00m 05s)
13:53:45 Started scap-cdb-rebuild
13:53:45 Finished scap-cdb-rebuild (duration: 00m 00s)
reedy@mwdebug1001:/srv/mediawiki$