
publishing to doc.wikimedia.org fails: rsync command not found
Closed, Resolved (Public)

Description

Spotted on various builds such as https://integration.wikimedia.org/ci/job/publish-to-doc/6767/console

08:15:53 Fetching from:
08:15:53 - Instance...: 172.16.2.93
08:15:53 - Workspace..: /srv/jenkins/workspace/workspace/mwext-phpunit-coverage-docker-publish
08:15:53 - Subdir.....: cover
08:15:53 + rsync --archive --stats --compress '--rsh=/usr/bin/ssh -a -T -o ConnectTimeout=6 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' jenkins-deploy@172.16.2.93:/srv/jenkins/workspace/workspace/mwext-phpunit-coverage-docker-publish/cover/. .
08:15:53 Warning: Permanently added '172.16.2.93' (ECDSA) to the list of known hosts.
08:15:54 bash: line 1: rsync: command not found
08:15:54 rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
08:15:54 rsync error: error in rsync protocol data stream (code 12) at io.c(235) [Receiver=3.1.3]
08:15:54 Build step 'Execute shell' marked build as failure

MediaWiki core documentation was last published on Jan 26 2022 19:12:04. This is fallout from the migration to Bullseye (T252071).
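The log shows the failure coming from the remote end: rsync over ssh needs an `rsync` binary on both hosts, and the remote shell reported `rsync: command not found`. A minimal sketch of a preflight check that would surface this with a clearer message follows. The helper name `require_cmd` is hypothetical and not part of the actual publish job.

```shell
# Hypothetical helper (not taken from the CI job): fail early with a clear
# message when a required command is missing from PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: required command '$1' not found in PATH" >&2
        return 1
    fi
}

# Possible usage, local side:
#   require_cmd rsync || exit 1
# The remote side needs the same check run over ssh, for example:
#   ssh jenkins-deploy@172.16.2.93 'command -v rsync'
```

`command -v` is the POSIX way to test for a command, so the same check works inside the remote shell that ssh spawns.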

Event Timeline

hashar triaged this task as Unbreak Now! priority. (Jan 27 2022, 8:40 AM)
integration-cumin:~$ sudo cumin --force 'name:docker' 'which rsync'
18 hosts will be targeted:
integration-agent-docker-[1023-1039].integration.eqiad1.wikimedia.cloud,integration-agent-puppet-docker-1002.integration.eqiad1.wikimedia.cloud
FORCE mode enabled, continuing without confirmation
===== NODE GROUP =====                                                                        
(1) integration-agent-puppet-docker-1002.integration.eqiad1.wikimedia.cloud                   
----- OUTPUT of 'which rsync' -----                                                           
/usr/bin/rsync                                                                                
================                                                                              
PASS |██                                   |   6% (1/18) [00:00<00:13,  1.26hosts/s]          
FAIL |██████████████████████████████████  |  94% (17/18) [00:00<00:00, 21.10hosts/s]
94.4% (17/18) of nodes failed to execute command 'which rsync': integration-agent-docker-[1023-1039].integration.eqiad1.wikimedia.cloud
5.6% (1/18) success ratio (< 100.0% threshold) for command: 'which rsync'. Aborting.: integration-agent-puppet-docker-1002.integration.eqiad1.wikimedia.cloud
5.6% (1/18) success ratio (< 100.0% threshold) of nodes successfully executed all commands. Aborting.: integration-agent-puppet-docker-1002.integration.eqiad1.wikimedia.cloud

So rsync is gone somehow :-\

From the host that works:

# aptitude why rsync
i   git-fat Depends rsync

However, git-fat is no longer available on Bullseye (https://gerrit.wikimedia.org/r/c/operations/puppet/+/677807 / T275873), so as a side effect rsync is no longer installed either. Regardless, rsync should be declared as an explicit dependency.
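To illustrate the point about implicit dependencies: on a Debian host, `apt-mark showauto` lists packages that were only pulled in by another package, and `apt-mark manual <package>` marks one as explicitly installed. The sketch below is a hand-run illustration only; the helper name `pkg_in_list` is hypothetical, and the real fix belongs in Puppet rather than in ad hoc commands.

```shell
# Hypothetical helper: succeeds when package name $1 appears as a whole line
# in the package list read from stdin.
pkg_in_list() {
    grep -Fxq -- "$1"
}

# Possible usage on a host where rsync is still present:
#   if apt-mark showauto | pkg_in_list rsync; then
#       echo "rsync is only installed as a dependency of another package"
#       sudo apt-mark manual rsync   # keep it even if that package goes away
#   fi
```

A package marked as automatic can be removed once nothing depends on it anymore, which matches the `aptitude why rsync` output above showing git-fat as the only reason rsync was installed.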

Change 757613 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] ci: ensure rsync is on all WMCS CI agents

https://gerrit.wikimedia.org/r/757613

Change 757613 merged by Muehlenhoff:

[operations/puppet@production] ci: ensure rsync is on all WMCS CI agents

https://gerrit.wikimedia.org/r/757613

Mentioned in SAL (#wikimedia-releng) [2022-01-27T09:16:54Z] <hashar> integration: cumin --force 'name:docker' 'apt install rsync' # T300236

From https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen-docker/buildTimeTrend

rsync_has_vanished.png (442×951 px, 165 KB)

The builds failed on the hosts numbered 1023 and up, which are the new instances.

Missing docs will be regenerated as branches are updated.