Changes voted CR+2 do not enter gate-and-submit unless Verified is +2.
Closed, ResolvedPublic

Description

With the Zuul upgrade earlier today ( 2.0.0-327-g3ebedde-wmf2precise1 ) , changes which are Verified:+1 no more enter gate-and-submit.

The workaround is to vote Verified+2 and CR+2 which let Zuul accept the patch.

I think the root cause is when I changed the label to be upper case with: https://gerrit.wikimedia.org/r/#/c/226220/1/zuul/layout.yaml . That unbroke a feature that never worked previously which is that Zuul looks at the change submit record to verify whether a patch can actually be merged. If not, Zuul does not bother entering the change in the pipeline.

An example for a job having Verified: +1 (edited):

gerrit query 223572 --submit-records
change I265913dcb28a09b2ba803a226ee03b96fc543ab9
  project: mediawiki/core
  branch: master
  url: https://gerrit.wikimedia.org/r/223572
  open: true
  status: NEW
  submitRecords:
    status: NOT_READY
    labels:
      label: Verified
      status: NEED
    labels:
      label: Code-Review
      status: NEED

The change is not ready because neither Verified nor Code-Review received the necessary votes. On OpenStack a change needs both to be set before the gate bother dealing with the change.

Potentially, when we had the lower case verified the status was being ignored somehow (have to test). Now that is upper case Verified and Code-Review Zuul behavior works as intended but ends up being a regression for us.

Maybe we can soften the approval in gate-and-submit to allow changes with Verified -1,0,+1,+2 to enter the gate. Have to be tested as well.

The relevant function is in integration/zuul.git branch debian/precise-wikimedia then: zuul/trigger/gerrit.py Gerrit.canMerge() . Pasting it here as a convenience:

def canMerge(self, change, allow_needs):
    if not change.number:
        self.log.debug("Change has no number; considering it merged")
        # Good question.  It's probably ref-updated, which, ah,
        # means it's merged.
        return True
    data = change._data
    if not data:
        return False
    if 'submitRecords' not in data:
        return False
    try:
        for sr in data['submitRecords']:
            if sr['status'] == 'OK':
                return True
            elif sr['status'] == 'NOT_READY':
                for label in sr['labels']:
                    if label['status'] == 'OK':
                        continue
                    elif label['status'] in ['NEED', 'REJECT']:
                        # It may be our own rejection, so we ignore
                        if label['label'].lower() not in allow_needs:
                            return False
                        continue
                    else:
                        # IMPOSSIBLE
                        return False
            else:
                # CLOSED, RULE_ERROR
                return False
    except:
        self.log.exception("Exception determining whether change"
                           "%s can merge:" % change)
        return False
    return True

Should be easy to exercise in integration/config test suite.

hashar created this task.Jul 22 2015, 4:12 PM
hashar updated the task description. (Show Details)
hashar raised the priority of this task from to Needs Triage.
hashar added a subscriber: hashar.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 22 2015, 4:12 PM

The reason for that behavior is OpenStack requires a change to be all green (two CR+2, one Approval+1 and Verified+2) before it can enter the gate. Their gate pipeline takes hours to process changes, they dont want it to be overloaded with patches that did not pass tests yet.

OpenStack 'gate' pipeline has:

require:
  open: True
  current-patchset: True            
  approval:
    - verified: [1, 2]
      username: jenkins             
    - workflow: 1 
trigger:
  gerrit:
    - event: comment-added
      approval:
        - workflow: 1
    - event: comment-added
      approval:
        - verified: 1
      username: jenkins

Our 'gate-and-submit' has:

trigger:
  gerrit:
    - event: comment-added
      email:
       - ^(?!l10n-bot@translatewiki\.net).*$
      approval:
        # Zuul EventFilter normalizes approvals to lower case T106436
        - code-review: 2

Not sure what require is for, might a way to change the canMerge() behavior. workflow+1 is the equivalent of our CR+2 to land a change. I am not sure what is the default behavior when there is no require section.

When a change enters gate-and-submit, Zuul reset the Verified score to 0:

start:
  gerrit:
    Verified: 0

Then when verifying whether the change can merge (in canMerge()), it is set to ignore scores that have been changed by the pipeline itself. That is the bit of code:

elif label['status'] in ['NEED', 'REJECT']:
    # It may be our own rejection, so we ignore
    if label['label'].lower() not in allow_needs:
        return False
    continue

Where allow_needs contains the labels set in start. For gate-and-submit: [ 'Verified' ]. That does not match here because the submit record is normalized to lower(), thus 'verified' is not found, False returned and the change considered to not be ready for a merge.

The reason I changed the labels to be upper case is because gerrit review --label verified=-1 causes a SQL error (T106596). But that causes a bunch of other side effects in Zuul which really expect it to be lower case everywhere in its code.

So potentially we can revert the Zuul faulty code at https://review.openstack.org/#/c/138132/

And revert our layout.yaml file back to lower case.

Change 226415 had a related patch set uploaded (by Hashar):
Revert "Fix passing labels to Gerrit when they are not defined in All-Projects"

https://gerrit.wikimedia.org/r/226415

hashar moved this task from Backlog to Bugs on the Zuul board.Jul 22 2015, 9:07 PM
hashar triaged this task as Unbreak Now! priority.
hashar set Security to None.

Change 226415 merged by Hashar:
Revert "Fix passing labels to Gerrit when they are not defined in All-Projects"

https://gerrit.wikimedia.org/r/226415

Change 226532 had a related patch set uploaded (by Hashar):
Fix gerrit review case mismatch errors with labels

https://gerrit.wikimedia.org/r/226532

Change 226532 merged by Hashar:
Fix gerrit review case mismatch errors with labels

https://gerrit.wikimedia.org/r/226532

hashar added a comment.EditedJul 23 2015, 3:20 PM

The new package is at https://people.wikimedia.org/~hashar/debs/zuul_2.0.0-327-g3ebedde-wmf3precise1/

I have extracted both packages with dpkg-deb -x and did a diff:

 colordiff -ur {wmf2,wmf3}/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul
diff -ur wmf2/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/lib/gerrit.py wmf3/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/lib/gerrit.py
--- wmf2/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/lib/gerrit.py	2015-02-05 15:46:17.000000000 +0000
+++ wmf3/usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/lib/gerrit.py	2015-07-23 14:50:19.000000000 +0000
@@ -120,7 +120,7 @@
             if v is True:
                 cmd += ' --%s' % k
             else:
-                cmd += ' --label %s=%s' % (k, v)
+                cmd += ' --%s %s' % (k, v)
         cmd += ' %s' % change
         out, err = self._ssh(cmd)
         return err

The package is on gallium.wikimedia.org at /root/zuul_2.0.0-327-g3ebedde-wmf3precise1_amd64.deb and can be installed with dpkg -i /root/zuul_2.0.0-327-g3ebedde-wmf3precise1_amd64.deb.

Once upgraded we might need to change back the verified/code-review in layout.yaml to lower case again.

Can be reverted by installing the current version which is at /root/zuul_2.0.0-327-g3ebedde-wmf2precise1_amd64.deb

Change 226679 had a related patch set uploaded (by Hashar):
zuul: revert label back to all lower case

https://gerrit.wikimedia.org/r/226679

Change 226679 merged by jenkins-bot:
zuul: revert label back to all lower case

https://gerrit.wikimedia.org/r/226679

hashar closed this task as Resolved.Jul 24 2015, 8:46 AM
hashar claimed this task.

I had to include a revert patch in our Zuul deployment. As of 2.1.0-151-g30a433b-wmf1precise1 it is no more needed since upstream rewrote a large part of how Zuul interact with Zuul and it is not affected anymore.

Restricted Application added a subscriber: TerraCodes. · View Herald TranscriptJun 2 2016, 12:29 PM