cassandra/metrics-collector does not deploy with scap on a new install
Closed, Resolved (Public)

Description

During reimage of maps-test2004, deployment of metrics-collector fails with a reference to tin.eqiad.wmnet. Since this is a fresh install, I'm a bit puzzled as to where this reference comes from. See log below:

Error: Execution of '/usr/bin/scap deploy-local --repo cassandra/metrics-collector -D log_json:False' returned 70: 17:14:24 Using deprecated git_fat config, swap to git_binary_manager
17:14:24 Fetch from: http://tin.eqiad.wmnet/cassandra/metrics-collector/.git
17:16:34 Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scap/cli.py", line 336, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 147, in main
    getattr(self, stage)()
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 290, in fetch
    git.fetch(self.context.cache_dir, git_remote)
  File "/usr/lib/python2.7/dist-packages/scap/git.py", line 374, in fetch
    git.clone(*cmd)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 1428, in __call__
    return RunningCommand(cmd, call_args, stdin, stdout, stderr)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 775, in __init__
    self.wait()
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 793, in wait
    self.handle_command_exit_code(exit_code)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 816, in handle_command_exit_code
    raise exc
ErrorReturnCode_128: 

  RAN: /usr/bin/git clone --jobs 10 http://tin.eqiad.wmnet/cassandra/metrics-collector/.git /srv/deployment/cassandra/metrics-collector-cache/cache

  STDOUT:


  STDERR:
Cloning into '/srv/deployment/cassandra/metrics-collector-cache/cache'...
fatal: unable to access 'http://tin.eqiad.wmnet/cassandra/metrics-collector/.git/': Failed to connect to tin.eqiad.wmnet port 80: Connection timed out

17:16:34 deploy-local failed: <ErrorReturnCode_128> 

  RAN: /usr/bin/git clone --jobs 10 http://tin.eqiad.wmnet/cassandra/metrics-collector/.git /srv/deployment/cassandra/metrics-collector-cache/cache

  STDOUT:


  STDERR:
Cloning into '/srv/deployment/cassandra/metrics-collector-cache/cache'...
fatal: unable to access 'http://tin.eqiad.wmnet/cassandra/metrics-collector/.git/': Failed to connect to tin.eqiad.wmnet port 80: Connection timed out


Error: /Stage[main]/Cassandra::Metrics/Scap::Target[cassandra/metrics-collector]/Package[cassandra/metrics-collector]/ensure: change from absent to present failed: Execution of '/usr/bin/scap deploy-local --repo cassandra/metrics-collector -D log_json:False' returned 70: 17:14:24 Using deprecated git_fat config, swap to git_binary_manager
17:14:24 Fetch from: http://tin.eqiad.wmnet/cassandra/metrics-collector/.git
17:16:34 Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scap/cli.py", line 336, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 147, in main
    getattr(self, stage)()
  File "/usr/lib/python2.7/dist-packages/scap/deploy.py", line 290, in fetch
    git.fetch(self.context.cache_dir, git_remote)
  File "/usr/lib/python2.7/dist-packages/scap/git.py", line 374, in fetch
    git.clone(*cmd)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 1428, in __call__
    return RunningCommand(cmd, call_args, stdin, stdout, stderr)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 775, in __init__
    self.wait()
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 793, in wait
    self.handle_command_exit_code(exit_code)
  File "/usr/lib/python2.7/dist-packages/scap/sh.py", line 816, in handle_command_exit_code
    raise exc
ErrorReturnCode_128: 

  RAN: /usr/bin/git clone --jobs 10 http://tin.eqiad.wmnet/cassandra/metrics-collector/.git /srv/deployment/cassandra/metrics-collector-cache/cache

  STDOUT:


  STDERR:
Cloning into '/srv/deployment/cassandra/metrics-collector-cache/cache'...
fatal: unable to access 'http://tin.eqiad.wmnet/cassandra/metrics-collector/.git/': Failed to connect to tin.eqiad.wmnet port 80: Connection timed out

17:16:34 deploy-local failed: <ErrorReturnCode_128> 

  RAN: /usr/bin/git clone --jobs 10 http://tin.eqiad.wmnet/cassandra/metrics-collector/.git /srv/deployment/cassandra/metrics-collector-cache/cache

  STDOUT:


  STDERR:
Cloning into '/srv/deployment/cassandra/metrics-collector-cache/cache'...
fatal: unable to access 'http://tin.eqiad.wmnet/cassandra/metrics-collector/.git/': Failed to connect to tin.eqiad.wmnet port 80: Connection timed out

Event Timeline

Gehel created this task. Jun 13 2018, 5:21 PM
Restricted Application added a subscriber: Aklapper. Jun 13 2018, 5:21 PM
Gehel added a comment. Jun 13 2018, 5:47 PM

Editing /srv/deployment/cassandra/metrics-collector-cache/.config to replace the reference to tin with a reference to deploy1001 seems to fix the issue. But since this is a fresh reimage, the wrong config must have come from somewhere else, and that source also needs to be fixed.
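
For reference, the manual edit above can be sketched roughly as follows. This is only an illustration, not what was actually run: it treats the .config file as plain text (its real format is not shown in this task), the fully qualified name deploy1001.eqiad.wmnet is an assumption based on the old hostname, and replace_host is a hypothetical helper.

```python
from pathlib import Path

OLD_HOST = "tin.eqiad.wmnet"         # stale deployment server seen in the log
NEW_HOST = "deploy1001.eqiad.wmnet"  # assumed FQDN of the new deployment server


def replace_host(config_path, old=OLD_HOST, new=NEW_HOST):
    """Replace every occurrence of `old` with `new` in a text file.

    Hypothetical helper. Returns the number of occurrences replaced and
    leaves a `.bak` copy next to the file whenever it changes something.
    """
    path = Path(config_path)
    text = path.read_text()
    count = text.count(old)
    if count:
        # Keep the original content so the change can be reverted by hand.
        path.with_name(path.name + ".bak").write_text(text)
        path.write_text(text.replace(old, new))
    return count
```

Calling replace_host("/srv/deployment/cassandra/metrics-collector-cache/.config") on the target would mirror the manual edit, but it only patches the symptom on that one host.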

Gehel added a comment. Jun 14 2018, 6:11 AM

Looking on deploy1001, I see that /srv/deployment/cassandra/metrics-collector/.git/DEPLOY_HEAD also has a reference to tin.eqiad.wmnet. I suppose I should correct it there. Could anyone confirm that editing that file is safe?

Joe added a subscriber: Joe. Jun 14 2018, 10:36 AM

Yes, running a simple scap deploy --init in that directory on deploy1001 was enough to fix this.
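
To check whether other repositories on the deployment server still carry the old hostname, something like the sketch below could help. It is an assumption-laden illustration rather than an existing scap feature: find_stale_references is a hypothetical helper, and it simply walks a directory tree looking for files that contain the string tin.eqiad.wmnet.

```python
import os


def find_stale_references(root, needle="tin.eqiad.wmnet", max_bytes=1_000_000):
    """Return a sorted list of files under `root` whose content contains `needle`.

    Hypothetical helper. Files larger than `max_bytes` and files that
    cannot be read are skipped.
    """
    needle_bytes = needle.encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.path.getsize(full) > max_bytes:
                    continue  # skip large files such as pack files
                with open(full, "rb") as fh:
                    if needle_bytes in fh.read():
                        hits.append(full)
            except OSError:
                continue  # unreadable or vanished file; ignore it
    return sorted(hits)
```

Pointing it at /srv/deployment/cassandra/metrics-collector/.git should list DEPLOY_HEAD if the stale reference is still there. Any other hits would be candidates for the same scap deploy --init treatment.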

Joe closed this task as Resolved. Jun 14 2018, 10:37 AM
Joe claimed this task.
Vvjjkkii renamed this task from cassandra/metrics-collector does not deploy with scap on a new install to o2aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Joe as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
mobrovac renamed this task from o2aaaaaaaa to cassandra/metrics-collector does not deploy with scap on a new install. Jul 1 2018, 11:33 AM
mobrovac closed this task as Resolved.
mobrovac assigned this task to Joe.
mobrovac lowered the priority of this task from High to Medium.
mobrovac updated the task description.
mobrovac removed a subscriber: Joe.