Page MenuHomePhabricator

Scap reports a check failure as an SSH connection failure
Closed, ResolvedPublic

Description

During the deployment of RESTBase today, the deploy failed because one of the checks did not pass. However, instead of reporting a check failure, Scap dumped the SSH verbose log to the screen:

restbase/deploy: config_deploy stage(s): 100% (ok: 1; fail: 0; left: 0)         
19:28:38 Running remote deploy cmd ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'restbase/deploy', '--force', '-g', 'canary', 'promote', '--refresh-config']
19:28:38 Using key: /etc/keyholder.d/deploy_service.pub
19:29:33 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'restbase/deploy', '--force', '-g', 'canary', 'promote', '--refresh-config'] on restbase1011.eqiad.wmnet returned [1]: OpenSSH_6.7p1 Debian-5+deb8u3, OpenSSL 1.0.2m  2 Nov 2017
debug1: Reading configuration data /dev/null
debug1: Connecting to restbase1011.eqiad.wmnet [10.64.0.113] port 22.
debug1: Connection established.
debug1: identity file /etc/keyholder.d/deploy_service.pub type 1
debug1: key_load_public: No such file or directory
debug1: identity file /etc/keyholder.d/deploy_service.pub-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.7p1 Debian-5+deb8u3
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.7p1 Debian-5+deb8u3
debug1: match: OpenSSH_6.7p1 Debian-5+deb8u3 pat OpenSSH* compat 0x04000000
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr umac-128-etm@openssh.com none
debug1: kex: client->server aes128-ctr umac-128-etm@openssh.com none
debug1: sending SSH2_MSG_KEX_ECDH_INIT
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ECDSA 69:2e:30:b5:30:da:9d:21:a3:a4:ec:bf:c6:df:01:28
debug1: Host 'restbase1011.eqiad.wmnet' is known and matches the ECDSA host key.
debug1: Found key in /etc/ssh/ssh_known_hosts:1190
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,keyboard-interactive
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /etc/keyholder.d/deploy_service.pub
debug1: Server accepts key: pkalg ssh-rsa blen 279
debug1: Authentication succeeded (publickey).
Authenticated to restbase1011.eqiad.wmnet ([10.64.0.113]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: Sending command: /usr/bin/scap deploy-local -v --repo restbase/deploy --force -g canary promote --refresh-config
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
debug1: fd 1 clearing O_NONBLOCK
Transferred: sent 3428, received 8464 bytes, in 54.1 seconds
Bytes per second: sent 63.3, received 156.4
debug1: Exit status 1

restbase/deploy: promote and restart_service stage(s): 100% (ok: 0; fail: 1; left: 0)
19:29:33 1 targets had deploy errors
19:29:33 1 targets failed
19:29:33 1 of 2 canary1 targets failed, exceeding limit
Rollback all deployed groups? [Y/n]: n

Note that this log contains no indication whatsoever that the problem was a check that failed.

Event Timeline

mobrovac created this task.

Just had this same thing happen on T186532

That is the error reported to scap deploy was:

21:17:02 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'integration/slave-scripts', '--force', '-g', 'default', 'fetch', '--refresh-config'] on deployment-mediawiki05
.deployment-prep.eqiad.wmflabs returned [70]: OpenSSH_6.7p1 Debian-5+deb8u4, OpenSSL 1.0.2n  7 Dec 2017                                                                   
debug1: Reading configuration data /dev/null                                                                                                                              
debug1: Connecting to deployment-mediawiki05.deployment-prep.eqiad.wmflabs [10.68.22.21] port 22.                                                                         
debug1: Connection established.                                                                                                                                           
debug1: identity file /etc/keyholder.d/mwdeploy.pub type 1                                                                                                                
debug1: key_load_public: No such file or directory                                                                                                                        
debug1: identity file /etc/keyholder.d/mwdeploy.pub-cert type -1                                                                                                          
debug1: Enabling compatibility mode for protocol 2.0                                                                                                                      
debug1: Local version string SSH-2.0-OpenSSH_6.7p1 Debian-5+deb8u4                                                                                                        
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.7p1 Debian-5+deb8u3                                                                                
debug1: match: OpenSSH_6.7p1 Debian-5+deb8u3 pat OpenSSH* compat 0x04000000                                                                                               
debug1: SSH2_MSG_KEXINIT sent                                                                                                                                             
debug1: SSH2_MSG_KEXINIT received                                                                                                                                         
debug1: kex: server->client aes128-ctr umac-128-etm@openssh.com none                                                                                                      
debug1: kex: client->server aes128-ctr umac-128-etm@openssh.com none                                                                                                      
debug1: sending SSH2_MSG_KEX_ECDH_INIT                                                                                                                                    
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY                                                                                                                                 
debug1: Server host key: ECDSA 2b:c7:c7:45:c0:02:a0:07:50:15:4c:c1:58:d4:56:3e                                                                                            
debug1: Host 'deployment-mediawiki05.deployment-prep.eqiad.wmflabs' is known and matches the ECDSA host key.                                                              
debug1: Found key in /home/thcipriani/.ssh/known_hosts:4                                                                                                                  
debug1: SSH2_MSG_NEWKEYS sent                                                                                                                                             
debug1: expecting SSH2_MSG_NEWKEYS                                                                                                                                        
debug1: SSH2_MSG_NEWKEYS received                                                                                                                                         
debug1: SSH2_MSG_SERVICE_REQUEST sent                                                                                                                                     
debug1: SSH2_MSG_SERVICE_ACCEPT received                                                                                                                                  
debug1: Authentications that can continue: publickey                                                                                                                      
debug1: Next authentication method: publickey                                                                                                                             
debug1: Offering RSA public key: /etc/keyholder.d/mwdeploy.pub                                                                                                            
debug1: Server accepts key: pkalg ssh-rsa blen 535                                                                                                                        
debug1: Authentication succeeded (publickey).                                                                                                                             
Authenticated to deployment-mediawiki05.deployment-prep.eqiad.wmflabs ([10.68.22.21]:22).                                                                                 
debug1: channel 0: new [client-session]                                                                                                                                   
debug1: Requesting no-more-sessions@openssh.com                                                                                                                           
debug1: Entering interactive session.                                                                                                                                     
debug1: Sending command: /usr/bin/scap deploy-local -v --repo integration/slave-scripts --force -g default fetch --refresh-config                                         
http://deployment-tin.deployment-prep.eqiad.wmflabs/integration/slave-scripts/.git                                                                                        
/srv/deployment/integration/slave-scripts-cache/cache                                                                                                                     
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0                                                                                                     
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0                                                                                                 
debug1: channel 0: free: client-session, nchannels 1                                                                                                                      
debug1: fd 1 clearing O_NONBLOCK                                                                                                                                          
Transferred: sent 4164, received 24780 bytes, in 1.8 seconds                                                                                                              
Bytes per second: sent 2359.1, received 14039.2         
debug1: Exit status 70

Which is completely useless information.

Looking at scap deploy-log -v I found the real error:

21:17:02 [deployment-mediawiki05.deployment-prep.eqiad.wmflabs] Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scap/cli.py", line 330, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 146, in main
    getattr(self, stage)()
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 332, in fetch
    for manager in git_binary_manager:
TypeError: 'NoneType' object is not iterable
21:17:02 [deployment-mediawiki05.deployment-prep.eqiad.wmflabs] deploy-local failed: <TypeError> {}

Somewhere recently error logging was broken. Based on the timing of this ticket this broken somewhere in 3.7.4 (which was Dec 7th, the next release happened Jan 5th)

Here's what the changlog has for that release: https://github.com/wikimedia/scap/blob/release/debian/changelog#L37-L56

scap (3.7.4-1) unstable; urgency=medium

  * Disk usage: Cache submodules and use --reference to save space
    
  * Logging: Don't lie when sync-wikiversions actually failed

  * Premium Feature: New scap sayings

  * Internals: Use the `sh` library to wrap git commands.

  * Logging: Correct action for no-log-message

  * Disk usage: Add a config key `cache_revs` to specify how many revs to keep

  * Modernization: Migrate from nosetests to pytest, plus tons of changes for
    python 3.x compatibility.

  * Logging: pass `-v` to ssh when running `scap deploy -v`

 -- Mukunda Modell <mmodell@wikimedia.org>  Thu, 07 Dec 2017 11:35:58 -0600
thcipriani claimed this task.

Closing, D1030 went out with Scap 3.8.1-1 in early May.