git deploy sync stays at 0/2.
Description
Event Timeline
There were 2 salt-minion processes running on sca01; I killed both and then started the service back up. I also restarted the salt-minion process on sca02 for good measure.
Updating with direct salt calls on sca0[12] seems to work:
bd808@deployment-sca01:~$ sudo salt-call deploy.fetch 'graphoid/deploy'
[INFO ] Executing command '/usr/bin/git fetch' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git fetch --tags' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule status --quiet' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git checkout .gitmodules' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git config remote.origin.url' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule sync' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule foreach --recursive git fetch' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule foreach --recursive git fetch --tags' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git show-ref refs/tags/graphoid/deploy-sync-20151127-170244' in directory '/srv/deployment/graphoid/deploy'
local:
    ----------
    dependencies:
    repo:
        graphoid/deploy
    status:
        0
    tag:
        graphoid/deploy-sync-20151127-170244
bd808@deployment-sca01:~$ sudo salt-call deploy.checkout 'graphoid/deplo[40/239]
[INFO ] Executing command '/usr/bin/git describe --always --tag' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command u'/usr/bin/git checkout --force --quiet tags/graphoid/deploy-sync-20151127-170244' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule status --quiet' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git checkout .gitmodules' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git config remote.origin.url' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule sync' in directory '/srv/deployment/graphoid/deploy'
[INFO ] Executing command '/usr/bin/git submodule update --recursive --init' in directory '/srv/deployment/graphoid/deploy'
local:
    ----------
    dependencies:
    repo:
        graphoid/deploy
    status:
        0
    tag:
        graphoid/deploy-sync-20151127-170244
bd808@deployment-sca01:~$ cd /srv/deployment/graphoid/deploy/
bd808@deployment-sca01:/srv/deployment/graphoid/deploy$ git log
commit 54f25d526597d6093299c35cb00861e9adecc015
Author: Yuri Astrakhan <yurik@wikimedia.org>
Date:   Fri Nov 27 19:33:59 2015 +0300

    Update graphoid to 45d31c1

    List of changes:
    20db530 Added deploy.remote option docs and bumped service-runner version
    a1a347c Added docs for node version setting
    06aceb0 Fixed language in docs
    06959fd Better wording of docs for deploy.remote option
    15ee1e8 Docs: Deploy: Ensure latest source dependencies
    d30a2b6 CSP Headers: Allow the configuration of sending the security headers
    45d31c1 Use graph api and new Vega api
    xxxxxxx Update node module dependencies

    Change-Id: I80efec8a6c2d2a472ccad38bdadd67dd3345939e
I tried a no-op deploy from deployment-bastion. According to git deploy on deployment-bastion it still did not work, but I can see the new tag on both of the sca0[12] hosts. This makes me think the problem is that the returners on sca0[12] are failing to write back to the redis instance on deployment-bastion.
I tried testing with the test/testrepo Trebuchet target and got similar results. Trebuchet reported that all fetches and syncs to the minions deployment-db1 and deployment-mathoid failed, but manually checking /srv/deployment/test/testrepo on those servers shows that the new deploy tag (test/testrepo-sync-20151127-192854) is present and synced.
Deployment-bastion seems to have ferm enabled but I can also see that it has accept rules for redis:
$ sudo iptables -L -n | grep 6379
ACCEPT     tcp  --  10.0.0.0/8           0.0.0.0/0            tcp dpt:6379
ACCEPT     tcp  --  208.80.154.136       0.0.0.0/0            tcp dpt:6379
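The accept rules only show what the firewall permits; they do not prove that redis answers when contacted from a minion. A quick probe run on one of the sca0[12] hosts could narrow that down. The following is a minimal sketch, assuming redis listens on port 6379 and does not require authentication. The function name redis_ping and the hostname in the usage comment are placeholders, not part of Trebuchet or Salt.

```python
import socket


def redis_ping(host, port=6379, timeout=3.0):
    """Return True if a server at host:port answers PING with +PONG.

    Hypothetical helper for manual debugging; not part of Trebuchet or Salt.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # Redis accepts inline commands terminated by CRLF.
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Refused, filtered, timed out, or the name did not resolve.
        return False
    # An instance that requires auth replies with an error instead of +PONG.
    return reply.startswith(b"+PONG")


# Usage (placeholder hostname, substitute the real redis host):
#   redis_ping("deployment-bastion.example")
```

A True result would only show that the port is reachable and redis responds; it would not rule out a problem in the returner itself.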
I know that there have been some Puppet changes related to redis in the last few days, so there may have been some small change that is making Trebuchet sad.
It works now. Should I close it? @bd808, is there something you are still tracking with this?
This was a bug introduced because of a new timeout feature of trebuchet. Syncing didn't work in prod either. It was fixed yesterday, so closing it.