Add a second Gerrit connection in Zuul config
Closed, Declined (Public)

Description

Our production Zuul needs to be attached to a second Gerrit in order to validate the Gerrit 3.x upgrade (T200739) in production.

In Puppet, in modules/zuul/templates/zuul.conf.erb, we would replace the [gerrit] section with something such as:

zuul.conf
[connection gerrit-legacy]
driver=gerrit
user=jenkins-bot
baseurl=https://gerrit.wikimedia.org/r
sshkey=/var/lib/zuul/.ssh/id_rsa
event_delay=10

[connection gerrit-new]
driver=gerrit
user= <XXX???XXX>
baseurl= <XXX???XXX>
sshkey=/var/lib/zuul/.ssh/id_rsa
event_delay=10
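
To sanity-check a rendered config of this shape, a minimal Python sketch can list its `[connection ...]` sections. The `user` and `baseurl` values for `gerrit-new` below are made-up placeholders (the real ones are still open above), and `list_connections` is a hypothetical helper, not part of Zuul.

```python
import configparser

# Sample config text. The gerrit-new user and baseurl are placeholders only.
SAMPLE = """\
[connection gerrit-legacy]
driver=gerrit
user=jenkins-bot
baseurl=https://gerrit.wikimedia.org/r
sshkey=/var/lib/zuul/.ssh/id_rsa
event_delay=10

[connection gerrit-new]
driver=gerrit
user=placeholder-user
baseurl=https://gerrit-new.example.org/r
sshkey=/var/lib/zuul/.ssh/id_rsa
event_delay=10
"""


def list_connections(text):
    """Return {connection name: options} for every [connection NAME] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    connections = {}
    for section in parser.sections():
        kind, _, name = section.partition(" ")
        if kind == "connection" and name:
            connections[name] = dict(parser[section])
    return connections


conns = list_connections(SAMPLE)
for name, opts in sorted(conns.items()):
    print(name, opts["driver"], opts["baseurl"])
```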

Event Timeline

Gerrit 2.16 should work (I tested it a long while back).

But you would need to confirm with Gerrit 3.1.

@QChris @hashar Do you see a need for a DNS name like "gerrit-new.wikimedia.org" or something similar, which could then be used for the baseurl in this example and for other things?

Great suggestion! We already have gerrit-test.wikimedia.org, which I think we could use here. It already points to gerrit1002 (the host we're testing the upgrade on).

@hashar: Note though that the Gerrit on this host is not expected to be running 24/7. So make sure you don't get Zuul alerts just because the host's Gerrit is down while we are testing a new gerrit.war.

Change 598057 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] zuul: use modern [connection] section in config

https://gerrit.wikimedia.org/r/598057

Change 598058 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] zuul: add a connection to gerrit-test.wikimedia.org

https://gerrit.wikimedia.org/r/598058

The changes above would attach the production zuul to gerrit-test.wikimedia.org. We would need to add some more configuration in integration/config zuul/layout.yaml to create some pipelines and apply them on some test projects.

hashar triaged this task as High priority.

Change 598057 merged by Dzahn:
[operations/puppet@production] zuul: use modern [connection] section in config

https://gerrit.wikimedia.org/r/598057

Change 599283 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] zuul: specify the driver for gerrit connection

https://gerrit.wikimedia.org/r/599283

Change 599283 merged by Dzahn:
[operations/puppet@production] zuul: specify the driver for gerrit connection

https://gerrit.wikimedia.org/r/599283

Change 598058 merged by Dzahn:
[operations/puppet@production] zuul: add a connection to gerrit-test.wikimedia.org

https://gerrit.wikimedia.org/r/598058

Mentioned in SAL (#wikimedia-operations) [2020-06-11T07:59:39Z] <hashar> Restarted Zuul on contint2001 for config change # T253263

I have restarted Zuul this morning and it is now listening for events from gerrit-test.wikimedia.org:

$ ssh -p 29418 gerrit-test.wikimedia.org gerrit show-connections -w
Session    User            Remote Host
--------------------------------------------------------------
cc70b311   jenkins-bot     contint2001.wikimedia.org
2c394f47   hashar          bast1002.wikimedia.org
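
The same check can be scripted. The sketch below assumes the three-column layout shown above (session, user, remote host); `connected_users` is a hypothetical helper and the sample text is simply the output pasted above.

```python
SAMPLE_OUTPUT = """\
Session    User            Remote Host
--------------------------------------------------------------
cc70b311   jenkins-bot     contint2001.wikimedia.org
2c394f47   hashar          bast1002.wikimedia.org
"""


def connected_users(output):
    """Map each user to the list of remote hosts it is connected from."""
    users = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three fields; the header splits into four
        # and the dashed separator into one, so both are skipped here.
        if len(fields) != 3:
            continue
        _session, user, host = fields
        users.setdefault(user, []).append(host)
    return users


users = connected_users(SAMPLE_OUTPUT)
print("jenkins-bot connected:", "jenkins-bot" in users)
```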

Once deployed, the zuul-merger misbehaves because it always fetches from 'origin' but never sets the remote URL, leading to:

2020-06-17 15:24:45,553 DEBUG zuul.Merger: Merging for change 605608,1.
2020-06-17 15:24:45,553 DEBUG zuul.Merger: Processing refspec refs/changes/08/605608/1 for project test/gerrit-ping / master ref Z869518357c7a46f9a2f42574d1461579
2020-06-17 15:24:45,556 DEBUG zuul.Merger: Unable to find commit for ref master/Z869518357c7a46f9a2f42574d1461579
2020-06-17 15:24:45,556 DEBUG zuul.Merger: No base commit found for (u'test/gerrit-ping', u'master')
2020-06-17 15:24:45,556 DEBUG zuul.Repo: Resetting repository /srv/zuul/git/test/gerrit-ping
2020-06-17 15:24:45,557 DEBUG zuul.Repo: Updating repository /srv/zuul/git/test/gerrit-ping
2020-06-17 15:24:46,419 DEBUG zuul.Repo: Checking out c7cedd024b444219d4fc63a5210534dbec2771bb
2020-06-17 15:24:47,156 DEBUG zuul.Merger: Unable to merge {u'oldrev': None, u'newrev': None, u'refspec': u'refs/changes/08/605608/1', u'merge_mode': 2, u'connection_name': u'gerrit_test', u'number': u'605608', u'project': u'test/gerrit-ping', u'url': u'ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/test/gerrit-ping', u'branch': u'master', u'patchset': 1, u'ref': u'Z869518357c7a46f9a2f42574d1461579'}
Traceback (most recent call last):
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 277, in _mergeChange
    commit = repo.merge(item['refspec'], 'resolve')
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 163, in merge
    self.fetch(ref)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 176, in fetch
    origin.fetch(ref)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/remote.py", line 789, in fetch
    res = self._get_fetch_info_from_stderr(proc, progress)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/remote.py", line 675, in _get_fetch_info_from_stderr
    proc.wait(stderr=stderr_text)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/cmd.py", line 415, in wait
    raise GitCommandError(self.args, status, errstr)
GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git fetch -v origin refs/changes/08/605608/1
  stderr: 'fatal: Couldn't find remote ref refs/changes/08/605608/1'

That got fixed upstream in commit 054eccc2fc07c2368fbd9dc20dde16c4eaa0639b.
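
The idea behind "set remote url on every getRepo" can be pictured with a toy model. This is not the Zuul code: `FakeRepo` and `FakeMerger` are hypothetical stand-ins, and the sketch assumes the cached clone's 'origin' still points at the other Gerrit (the `LEGACY_URL` value is an assumption for illustration).

```python
class FakeRepo:
    """Stand-in for a local clone; it only tracks the 'origin' URL."""

    def __init__(self, path, url):
        self.path = path
        self.origin_url = url

    def set_remote_url(self, url):
        self.origin_url = url


class FakeMerger:
    """Caches one repo per project, like a merger reusing its clones."""

    def __init__(self, update_url_on_reuse):
        self.repos = {}
        self.update_url_on_reuse = update_url_on_reuse

    def get_repo(self, project, url):
        repo = self.repos.get(project)
        if repo is None:
            repo = FakeRepo("/srv/zuul/git/" + project, url)
            self.repos[project] = repo
        elif self.update_url_on_reuse:
            # The behaviour the backport aims for: refresh 'origin' each time.
            repo.set_remote_url(url)
        return repo


PROJECT = "test/gerrit-ping"
# Assumed for illustration: the clone was first created against this URL.
LEGACY_URL = "ssh://jenkins-bot@gerrit.wikimedia.org:29418/test/gerrit-ping"
TEST_URL = "ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/test/gerrit-ping"

stale = FakeMerger(update_url_on_reuse=False)
stale.get_repo(PROJECT, LEGACY_URL)
print(stale.get_repo(PROJECT, TEST_URL).origin_url)  # still the first URL

fixed = FakeMerger(update_url_on_reuse=True)
fixed.get_repo(PROJECT, LEGACY_URL)
print(fixed.get_repo(PROJECT, TEST_URL).origin_url)  # follows the item URL
```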

Change 606226 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/zuul@patch-queue/debian/jessie-wikimedia] WMF: backport Set remote url on every getRepo in merger

https://gerrit.wikimedia.org/r/606226

Change 606226 merged by Hashar:
[integration/zuul@patch-queue/debian/jessie-wikimedia] WMF: backport Set remote url on every getRepo in merger

https://gerrit.wikimedia.org/r/606226

Change 606243 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/zuul/deploy@master] backport Set remote url on every getRepo in merger

https://gerrit.wikimedia.org/r/606243

Change 606243 merged by Hashar:
[integration/zuul/deploy@master] backport Set remote url on every getRepo in merger

https://gerrit.wikimedia.org/r/606243

I have deployed the change to contint1001 which is only running the zuul-merger process. I stopped the zuul-merger on contint2001. Then a recheck of https://gerrit-test.wikimedia.org/r/c/test/gerrit-ping/+/605608 leads to:

contint1001:~$ git -C  /srv/zuul/git/test/gerrit-ping remote -v
origin	ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/test/gerrit-ping (fetch)
origin	ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/test/gerrit-ping (push)

So the patch is working. The merge then fails further on with:

2020-06-17 19:16:55,097 ERROR zuul.Merger: Unable to reset repo <zuul.merger.merger.Repo object at 0x7f90381968d0>
Traceback (most recent call last):
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 342, in _mergeItem
    repo.reset()
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 108, in reset
    self.update()
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 214, in update
    origin.fetch(tags=True, force=True)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/remote.py", line 789, in fetch
    res = self._get_fetch_info_from_stderr(proc, progress)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/remote.py", line 675, in _get_fetch_info_from_stderr
    proc.wait(stderr=stderr_text)
  File "/srv/deployment/zuul/venv/local/lib/python2.7/site-packages/git/cmd.py", line 415, in wait
    raise GitCommandError(self.args, status, errstr)
GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git fetch --force --tags -v origin
  stderr: 'fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.'
$ git -C  /srv/zuul/git/test/gerrit-ping fetch
The authenticity of host '[gerrit-test.wikimedia.org]:29418 ([2620:0:861:3:208:80:154:78]:29418)' can't be established.
RSA key fingerprint is SHA256:j7HQoQ6fIuEgDHjONjI2CZ+2Iwxqgo2Ur5LbPqBgxOU.
Are you sure you want to continue connecting (yes/no)? 

Change 606249 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] zuul: add gerrit-test:29418 as a ssh known host

https://gerrit.wikimedia.org/r/606249

Change 606249 merged by CDanis:
[operations/puppet@production] zuul: add gerrit-test:29418 as a ssh known host

https://gerrit.wikimedia.org/r/606249
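
For reference, OpenSSH records a host on a non-default port in known_hosts as `[host]:port`. A minimal sketch to check whether such an entry is present follows; it only handles plain (non-hashed) entries, `has_host_key` is a hypothetical helper, and the key material in the sample is a placeholder, not the real key.

```python
def known_hosts_pattern(host, port=22):
    """Return the host field OpenSSH uses in known_hosts for host and port."""
    return host if port == 22 else "[%s]:%d" % (host, port)


def has_host_key(known_hosts_text, host, port=22):
    """True if a plain (non-hashed) known_hosts entry names host:port."""
    wanted = known_hosts_pattern(host, port)
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0].startswith("@"):
            # A marker such as @cert-authority precedes the host field.
            fields = fields[1:]
        if fields and wanted in fields[0].split(","):
            return True
    return False


# Placeholder key material, for illustration only.
SAMPLE = "[gerrit-test.wikimedia.org]:29418 ssh-rsa AAAAplaceholder\n"

print(known_hosts_pattern("gerrit-test.wikimedia.org", 29418))
print(has_host_key(SAMPLE, "gerrit-test.wikimedia.org", 29418))
```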

2020-06-17 19:56:22,557 DEBUG zuul.Repo: Updating repository /srv/zuul/git/test/gerrit-ping
2020-06-17 19:56:22,794 DEBUG zuul.Repo: Checking out 96f3e4b11b69e13fa6ca201ad5fceaf304760bdb
2020-06-17 19:56:22,982 DEBUG zuul.Repo: Merging refs/changes/08/605608/1 with args ['-s', 'resolve', 'FETCH_HEAD']
2020-06-17 19:56:22,998 DEBUG zuul.Repo: Set remote url to None
2020-06-17 19:56:23,011 DEBUG zuul.Repo: CreateZuulRef master/Z25df9a852a064afabd1d73b322bf79b4 at 96f3e4b11b69e13fa6ca201ad5fceaf304760bdb on <git.Repo "/srv/zuul/git/test/gerrit-ping/.git">

Grmblblb, the URL is set to None :-(

Change 606252 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/zuul/deploy@master] Revert "backport Set remote url on every getRepo in merger"

https://gerrit.wikimedia.org/r/606252

Change 606252 merged by Hashar:
[integration/zuul/deploy@master] Revert "backport Set remote url on every getRepo in merger"

https://gerrit.wikimedia.org/r/606252

Change 606259 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/zuul@patch-queue/debian/jessie-wikimedia] Revert "WMF: backport Set remote url on every getRepo in merger"

https://gerrit.wikimedia.org/r/606259

Change 606259 merged by Hashar:
[integration/zuul@patch-queue/debian/jessie-wikimedia] Revert "WMF: backport Set remote url on every getRepo in merger"

https://gerrit.wikimedia.org/r/606259

So for a change, the scheduler asks for a merge and it has the proper URL:

2020-06-17 20:06:56,681 DEBUG zuul.MergeClient: Submitting job <gear.Job 0x7f8a343d04d0 handle: None name: merger:merge unique: 77d82f82eed746209096bce6ca152182> with data {'items': [{'oldrev': None, 'newrev': None, 'refspec': u'refs/changes/54/606254/1', 'merge_mode': 2, 'number': '606254', 'connection_name': 'gerrit', 'project': 'mediawiki/core', 'url': 'ssh://jenkins-bot@gerrit.wikimedia.org:29418/mediawiki/core', 'branch': u'master', 'patchset': 1, 'ref': 'Zdec91b053d484b3790561b3fd94e4a2c'}]}

On the merger:

DEBUG zuul.Merger: Merging for change 606254,1.
DEBUG zuul.Merger: Processing refspec refs/changes/54/606254/1 for project mediawiki/core / master ref Zdec91b053d484b3790561b3fd94e4a2c
DEBUG zuul.Repo: Set remote url to ssh://jenkins-bot@gerrit.wikimedia.org:29418/mediawiki/core

The remote is set based on the item URL. That is good, and I guess it is what I had tested. Then it continues:

DEBUG zuul.Merger: Unable to find commit for ref master/Zdec91b053d484b3790561b3fd94e4a2c
DEBUG zuul.Merger: No base commit found for (u'mediawiki/core', u'master')
DEBUG zuul.Repo: Resetting repository /srv/zuul/git/mediawiki/core
DEBUG zuul.Repo: Updating repository /srv/zuul/git/mediawiki/core
DEBUG zuul.Repo: Checking out f47014c996a4b3df32d4dc95f238e8ce443d1fca
DEBUG zuul.Repo: Merging refs/changes/54/606254/1 with args ['-s', 'resolve', 'FETCH_HEAD']
DEBUG zuul.Repo: Set remote url to None
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^---- OH NO
DEBUG zuul.Repo: CreateZuulRef master/Zdec91b053d484b3790561b3fd94e4a2c at 3980a34aad53cf37722aeed82bf10a99629e4b01 on <git.Repo "/srv/zuul/git/mediawiki/core/.git">

The repository ends up with a remote pointing to None...
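
To spot this in a larger merger log, a small scan of the `zuul.Repo` debug lines is enough. This is a sketch under the assumption that the message format is exactly the one in the excerpts above; `remote_url_history` and `ends_unset` are hypothetical helpers.

```python
import re

# Matches the debug message seen in the merger log excerpts above.
SET_URL = re.compile(r"DEBUG zuul\.Repo: Set remote url to (?P<url>\S+)")


def remote_url_history(log_lines):
    """Return every URL the log reports being set on the remote, in order."""
    matches = (SET_URL.search(line) for line in log_lines)
    return [m.group("url") for m in matches if m]


def ends_unset(log_lines):
    """True if the last reported remote URL is the literal string 'None'."""
    history = remote_url_history(log_lines)
    return bool(history) and history[-1] == "None"


LOG = [
    "DEBUG zuul.Repo: Set remote url to ssh://jenkins-bot@gerrit.wikimedia.org:29418/mediawiki/core",
    "DEBUG zuul.Repo: Checking out f47014c996a4b3df32d4dc95f238e8ce443d1fca",
    "DEBUG zuul.Repo: Set remote url to None",
]

print(remote_url_history(LOG))
print(ends_unset(LOG))
```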

Another issue I have witnessed: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/578298/ received an event, but the merge job shows a connection and URL of gerrit_test, which is wrong:

2020-06-17 20:01:06,670 DEBUG zuul.MergeClient: Submitting job <gear.Job 0x7f8a3434a810 handle: None name: merger:merge unique: 7496e4f81f3049ef9816ad22c8530a5e> with data {'items': [{'oldrev': None, 'newrev': None, 'refspec': u'refs/changes/98/578298/4', 'merge_mode': 2, 'number': '578298', 'connection_name': 'gerrit_test', 'project': 'mediawiki/extensions/Wikibase', 'url': 'ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/mediawiki/extensions/Wikibase', 'branch': u'master', 'patchset': 4, 'ref': 'Z97a233b41ef548e1824cb250017cf966'}]}

Yet apparently the zuul-merger managed to fetch it???

2020-06-17 20:01:08,996 DEBUG zuul.Merger: Processing refspec refs/changes/98/578298/4 for project mediawiki/extensions/Wikibase / master ref Z97a233b41ef548e1824cb250017cf966
2020-06-17 20:01:08,996 DEBUG zuul.Repo: Set remote url to ssh://jenkins-bot@gerrit-test.wikimedia.org:29418/mediawiki/extensions/Wikibase
...
2020-06-17 20:01:11,217 DEBUG zuul.Repo: Merging refs/changes/98/578298/4 with args ['-s', 'resolve', 'FETCH_HEAD']
2020-06-17 20:01:11,387 DEBUG zuul.Repo: Set remote url to None
2020-06-17 20:01:11,391 DEBUG zuul.Repo: CreateZuulRef master/Z97a233b41ef548e1824cb250017cf966 at 68bedde07fbe29a38f11e4c314c296a7e6021904 on <git.Repo "/srv/zuul/git/mediawiki/extensions/Wikibase/.git">

So something very funky is happening for some reason :-\

Change 606269 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Revert "layout: Add basic pipelines for gerrit-test.wikimedia.org"

https://gerrit.wikimedia.org/r/606269

Change 606269 merged by jenkins-bot:
[integration/config@master] Revert "layout: Add basic pipelines for gerrit-test.wikimedia.org"

https://gerrit.wikimedia.org/r/606269

Yet again I have been too ambitious :-\ It turns out the Zuul scheduler gets confused when it receives an event for a change number that exists in both Gerrits. There must have been some upstream patches to the scheduler to fix that, but they are not in our version of Zuul.

Instead, I guess I should set up a dedicated Zuul instance targeting solely gerrit-test.
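
One way to picture that confusion is a lookup keyed by change number and patchset alone. This is purely illustrative: it has not been checked against how our Zuul version actually caches changes, `build_cache` is a hypothetical helper, and the second event (the same change number arriving from gerrit_test) is assumed.

```python
def build_cache(events, key):
    """Remember which connection each change came from, using key(event)."""
    cache = {}
    for event in events:
        cache[key(event)] = event["connection"]
    return cache


EVENTS = [
    {"connection": "gerrit", "number": 578298, "patchset": 4},
    # Assumed: the test Gerrit holds a change with the same number.
    {"connection": "gerrit_test", "number": 578298, "patchset": 4},
]

# Keyed without the connection: the later event overwrites the earlier one.
by_number = build_cache(EVENTS, lambda e: (e["number"], e["patchset"]))

# Keyed with the connection: both changes stay distinct.
by_connection = build_cache(
    EVENTS, lambda e: (e["connection"], e["number"], e["patchset"])
)

print(by_number)
print(len(by_connection))
```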

Somewhat related is a patch to properly filter events when there are two Gerrits: https://review.opendev.org/c/zuul/zuul/+/760907 (Zuul 4.3.0).

- pipeline:
    name: check
    trigger:
      gerrit-org-1:
        - event: patchset-created
          branch: 'master'
      gerrit-org-2:
        - event: patchset-created
          branch: 'develop'

Which implies that what I tried to achieve works nowadays.
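
The per-connection filtering in that pipeline can be sketched as follows. This is not Zuul's implementation: `pipeline_matches` and the event dictionaries are hypothetical, and branches are compared by exact string match here for simplicity.

```python
# Mirrors the trigger section of the pipeline above, keyed by connection name.
TRIGGERS = {
    "gerrit-org-1": [{"event": "patchset-created", "branch": "master"}],
    "gerrit-org-2": [{"event": "patchset-created", "branch": "develop"}],
}


def pipeline_matches(triggers, event):
    """True if the event matches a trigger declared for its own connection."""
    for trigger in triggers.get(event["connection"], []):
        same_type = trigger["event"] == event["type"]
        same_branch = trigger["branch"] == event["branch"]
        if same_type and same_branch:
            return True
    return False


on_master = {"connection": "gerrit-org-1", "type": "patchset-created", "branch": "master"}
wrong_gerrit = {"connection": "gerrit-org-2", "type": "patchset-created", "branch": "master"}

print(pipeline_matches(TRIGGERS, on_master))
print(pipeline_matches(TRIGGERS, wrong_gerrit))
```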