zuul-merger fails when repository names overlap
Closed, Declined (Public)

Description

When populating a new zuul-merger, some merge operations eventually fail, for example:

GitCommandError: 'git clone -v ssh://jenkins-bot@gerrit.wikimedia.org:29418/operations/puppet /srv/zuul/git/operations/puppet' returned with exit code 128
stderr: 'fatal: destination path '/srv/zuul/git/operations/puppet' already exists and is not an empty directory.

In the above case, a merge request for the repository operations/puppet/mariadb was handled first. That created the directory /srv/zuul/git/operations/puppet.

Later, when a merge request is handled for operations/puppet, git clone fails because the path already exists and is not empty.
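
The overlap described above can be detected ahead of time from the list of repository names alone. The following is a minimal sketch, not part of zuul; the function name `find_overlapping_repos` is hypothetical, and the sample list only uses repository names mentioned in this task.

```python
def find_overlapping_repos(names):
    """Return (parent, child) pairs where child lives under parent's path."""
    known = set(names)
    overlaps = []
    for name in sorted(known):
        parts = name.split("/")
        # Every proper path prefix that is itself a repository name means
        # the two working copies would share a directory on disk.
        for i in range(1, len(parts)):
            parent = "/".join(parts[:i])
            if parent in known:
                overlaps.append((parent, name))
    return overlaps


repos = [
    "operations/puppet",
    "operations/puppet/mariadb",
    "operations/software",
]
print(find_overlapping_repos(repos))
```

Whichever repository of a pair is handled first determines which clone fails, so both members of each pair are worth checking on a freshly populated merger.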

Workaround

  1. Delete the directory entirely, then either clone manually as the zuul user or recheck until a merge job runs on that host.
  2. Better, initialize the repository in place:

From T138455#2401076:

ssh scandium.eqiad.wmnet
sudo -H -u zuul bash -l
cd /srv/ssd/zuul/git/operations/software
git init .
git remote add origin ssh://jenkins-bot@gerrit.wikimedia.org:29418/operations/software
git remote set-head origin --auto

Fix up

zuul-merger should not just git-clone but be smarter and gracefully handle a directory that already exists.
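
One way to picture "gracefully handling an existing directory" is to choose the git commands based on the state of the target path. This is only a sketch under assumptions, not zuul's implementation: `plan_checkout` is a hypothetical helper that returns commands without running them, and the `git fetch` step is my addition, on the assumption that `git remote set-head origin --auto` needs the remote-tracking branch to exist locally.

```python
import tempfile
from pathlib import Path


def plan_checkout(path, remote_url):
    """Return the git commands needed to turn `path` into a checkout."""
    path = Path(path)
    if (path / ".git").exists():
        # Already a repository: nothing to create.
        return []
    if not path.exists() or not any(path.iterdir()):
        # Missing or empty directory: a plain clone works.
        return [["git", "clone", remote_url, str(path)]]
    # Non-empty directory (for example it holds a nested repository):
    # initialize in place instead of cloning.
    return [
        ["git", "-C", str(path), "init", "."],
        ["git", "-C", str(path), "remote", "add", "origin", remote_url],
        ["git", "-C", str(path), "fetch", "origin"],  # assumption, see above
        ["git", "-C", str(path), "remote", "set-head", "origin", "--auto"],
    ]


with tempfile.TemporaryDirectory() as root:
    parent = Path(root) / "operations" / "puppet"
    # Simulate the nested repository having been handled first.
    (parent / "mariadb").mkdir(parents=True)
    url = "ssh://jenkins-bot@gerrit.wikimedia.org:29418/operations/puppet"
    for cmd in plan_checkout(parent, url):
        print(" ".join(cmd))
```

Returning the commands instead of executing them keeps the decision logic easy to inspect; a real fix would also have to cope with partially initialized directories.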

Failure to set the symbolic ref for origin ( git remote set-head origin --auto ) causes:

2017-02-17 12:48:06,304 DEBUG zuul.Repo: Resetting repository /srv/zuul/git/operations/software
2017-02-17 12:48:06,305 DEBUG zuul.Repo: Updating repository /srv/zuul/git/operations/software
2017-02-17 12:48:07,192 ERROR zuul.Merger: Unable to reset repo <zuul.merger.merger.Repo object at 0x7f69dc6c57d0>
Traceback (most recent call last):
  File "/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 322, in _mergeItem
    repo.reset()
  File "/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/merger/merger.py", line 110, in reset
    repo.head.reference = origin.refs['HEAD']
  File "/usr/share/python/zuul/local/lib/python2.7/site-packages/git/util.py", line 706, in __getitem__
    raise IndexError("No item found with id %r" % (self._prefix + index))
IndexError: No item found with id u'origin/HEAD'

Event Timeline

Paladox subscribed.

I thought ytterbium.wikimedia.org is no more? I thought that's cobalt now?

Good catch. That was copy-pasted from the old setup; we now use gerrit.wikimedia.org.

Did a basic attempt at https://review.openstack.org/#/c/432477/ . But really I don't think we can easily mimic git clone :/

Mentioned in SAL (#wikimedia-releng) [2017-04-04T21:26:47Z] <hashar> contint2001 : rm -fR /srv/zuul/git/mediawiki/services/graphoid/deploy due to T157818

Mentioned in SAL (#wikimedia-releng) [2017-04-04T21:29:17Z] <hashar> contint1001 : rm -fR /srv/zuul/git/mediawiki/services/graphoid/deploy due to T157818

Mentioned in SAL (#wikimedia-releng) [2019-07-09T15:29:39Z] <thcipriani> contint{1,2}001: rm -rf /srv/zuul/git/mediawiki/services/restbase due to T157818

Mentioned in SAL (#wikimedia-releng) [2019-07-10T13:30:20Z] <hashar> on zuul-merger: rm /srv/zuul/git/operations/software/gerrit due to T157818 # T189549

That got addressed upstream with https://review.opendev.org/c/zuul/zuul/+/787451 ("Support overlapping repos and a flat workspace scheme"). It might theoretically be ported back to our version, but I don't think we should bother. A workaround is to prepopulate the zuul-merger repositories with all Gerrit-hosted repositories so we can control the namespace clashes. But the problem would resurface whenever a colliding repository is created later on, so that is not ideal.
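
To illustrate why a non-nested layout avoids the clash: if each repository name is mapped to a single directory name, no repository can land inside another one's working copy. The sketch below is purely illustrative and may differ from what the upstream change does; `flat_dirname` is a hypothetical name.

```python
from urllib.parse import quote_plus


def flat_dirname(project):
    """Map a project name to a single directory name with no nesting."""
    # quote_plus percent-encodes "/" as %2F, so the result contains no
    # path separator and every repository gets its own sibling directory.
    return quote_plus(project)


for name in ("operations/puppet", "operations/puppet/mariadb"):
    print(flat_dirname(name))
```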

Given that it is a rare occurrence once the zuul-mergers have been warmed up, and that I have documented the workaround in the task description, I am declining it.

When bringing up a new zuul-merger instance, I populated all active code git repositories with:

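gerrit ls-projects --type CODE --state ACTIVE | \
  xargs -I{} -n1 bash -c '
    # Create the parent directory under the merger git root
    mkdir -p "/srv/zuul/git/$(dirname {})"
    # Clone each project into the path mirroring its Gerrit name
    git clone "ssh://jenkins-bot@gerrit.wikimedia.org:29418/{}" \
        "/srv/zuul/git/{}"
  '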

The only clashes are for:

  • wikimedia/fundraising/crm/drupal (since wikimedia/fundraising/crm has a drupal directory)
  • analytics/reportcard/data (since analytics/reportcard has a data directory)