Page MenuHomePhabricator

Define an equivalent to Gerrit's +-1 +-2 for code review evaluation
Closed, DeclinedPublic

Description

It is not possible to "-1" or to "+2" a diff in Phabricator exactly like you would do in Gerrit. Then again, we seem to have all the pieces in place.

We need:

  • Possibility for All Users to add positive/negative comments and symbols (+1 and -1 are in fact symbols, not votes, since the decision over a review is not based on counting votes).
  • Possibility for the maintainers of a repository and only for them to reject or approve a diff, the later action implying the merge of the diff (triggering whatever CI process we have in place).
  • Possibility for anybody to abandon a diff, and recover an abandoned diff.

To be clear about policies: users should not be able to merge or reject a diff in a repo, unless they are the maintainers.

Are there other scenarios related with evaluation of diffs that we should consider?

Details

Reference
fl229

Event Timeline

flimport raised the priority of this task from to Medium.Sep 12 2014, 1:32 AM
flimport set Reference to fl229.

qgil wrote on 2014-04-28 15:42:39 (UTC)

"Award token" allows reviewers to give thumbs up, thumbs down, and 14 tokens more, some of them with positive/negative connotations, some of them clearly ambiguous. This is not as discrete as +1 / -1, but is this bad?

thiemowmde wrote on 2014-04-29 14:38:35 (UTC)

"Award token" allows reviewers to give thumbs up, thumbs down [...]

The so called "tokens" currently don't feel like a useful replacement. In Gerrit there is no vague stuff like a gray medal. What does this even mean? Green pluses and red minuses arranged in a table, very easy to understand and extremely quick to read. With the "tokens" we get a row of hard to distinguish icons, some with no meaning. Is there a "token" that blocks a change from being merged? Does a thumb down go away if a new patch set is uploaded? What if an issue got many negative icons initially but changed a lot since then? How to get rid of the negative votings? Is everyone supposed to remove his icon? This will lead to patches that get stuck in limbo and the only solution is to abandon it and upload the exact same code a second time in an other patch. Not that this doesn't happen in Gerrit - it happens a lot -, but if this is the same in Phabricator it's not better.

qgil wrote on 2014-04-29 18:27:42 (UTC)

Silly question: are the actions "Accept revision" and "Request changes" equivalent to +1/-1 (recommendations) or am I indeed merging/rejecting a patch (+2/-2)?

See {D8} for some playground test.

I evaluated a patch with my test dummy @qgiltest user to make sure that no admin permission mixed the results.

Rush wrote on 2014-04-29 18:39:34 (UTC)

Silly question: are the actions "Accept revision" and "Request changes" equivalent to +1/-1 (recommendations) or am I indeed merging/rejecting a patch (+2/-2)?

@Qgil

Hi! So I'm going to try to explain this in gerrit=>phab terms and hopefully I make sense.

If you want to say "this review needs changes"

gerrit => -1
phab => request changes

If you want to say this review is good to go as is:

gerrit => +2
phab => accept revision

If you want to say "I like this" but it's not workflow relevant (not sure how else to say it?):

gerrit => +1 or comment
phab => comment

There is another difference I think. Let's say you upload a diff via phab with 2 reviewers. If one of them accepts it's good to go. But if one of them requests changes, that person must accept for it to be considered ready to land. It doesn't stop you but it does warn you. Basically, anyone who requests changes must also then accept the differential, whereas if no one requests changes any of the reviewers can accept. This means that if two guys are reviewing and they both have separate concerns, they basically only come in after it's updated and review the concern they raised has been handled, and overlapping accepts can't back burner someone's raised concern.

clear as mud?

epriestley wrote on 2014-04-29 18:40:11 (UTC)

I'm not a Gerrit expert, so some of the details here might be slightly off, but:

  • "Accept" is like "+2".
  • "Request Changes" is like "-2".
  • Herald can also add "Blocking Reviewers", which are sort of like a "-2 by default". These reviewers must review a change before it will become "Accepted". This is often used to make sure a codebase owner, security team, etc., are involved in certain changes which match configurable criteria.
  • Reviewers can be removed from any state, even "Blocking". This is recorded on the transaction record to prevent abuse, and provides an escape hatch for stale rejects, someone going on vacation, etc.

There is no formal equivalent of "+1" or "-1" (i.e., nonbinding accept/reject). Existing installs generally use tokens and comments to express soft preferences or express appreciation or support for a change without performing review.

Rush wrote on 2014-04-29 18:43:23 (UTC)

I haven't actually used -2 in gerrit yet, so maybe that's just not a big part of normal ops workflow? But either way same idea.

qgil wrote on 2014-04-30 16:01:07 (UTC)

From an external point of view this looks more like a change of habit than an actual problem. The situation to solve is

  1. Maintainers must decide whether a patch is merged, pushed back with feedback, or abandoned for good.
  2. In order to do that, they welcome feedback from anyone. However, not everybody has the same expertise, neither they get the same trust. How maintainers distribute trust is often subjective (Mary is an expert, Peter has no clue, John fooled me once...) or ad-hoc (comment #3 makes sense, comment #5 is clueless...)

Gerrit solves this situation with a very clear distinction between regular reviewers (who can +1/-1/comment harmlessly) and maintainers with +2/-2 permissions. As far as I know, in Gerrit only admins can grant +2 permissions, by default they are given for specific repos, and we have a full fledged process to handle this.

Here, with our current set of permissions, I was surprised that someone like me was authorized to accept and merge a revision in a project that I don't even belong to. I was also surprised that I could nominate reviewers that will be able to merge a patch because, again, I'm a good example of a user not qualified to decide such things. Both in Gerrit and here you can CC users, and this is all I would expect to be allowed to do + comment with a +1/-1 type of recommendation. But this is a problem of permissions, unrelated to the +1/-1 discussion.

Even if Phabricator doesn't have +1/-1, the thumbs-up / thumbs-down tokens could be used for the same purpose (if you are serious, otherwise you can choose a bunch of additional icons). One significant difference is that in Gerrit evaluation and comment go together, you can see the evaluation of a users in the summary of a comment. Here you need a little extra effort to perceive comments as positive or negative, and to relate tokens with users commenting.

All in all I think it is not a big deal. We can figure out a simple process adapted to our needs, and if we have strong opinions about something (e.g. add token from comment) then we submit a specific request upstream and/or work on patch.

thiemowmde wrote on 2014-04-30 17:04:03 (UTC)

@Qgil The difference between Gerrit and Phabricator is:

  • In Gerrit a +/-1 is for a patch set (of possibly many patch sets in a patch). Here it is for a task (with possibly many diffs).
  • In Gerrit a +/-1 is reseted if a new patch set is uploaded. Here it's not (to be precise, I assume it's not).
  • In Gerrit a -1 literally means "there is a problem with this patch set, please improve". Here it's more like "I disagree in general (that's why my thumb down is on the task and not on the current diff) and suggest to close, please don't spend more time on this".
  • In Gerrit a +1 means "I think this can be merged as it is, but I'm either not 100% sure or don't have the right". In Phabricator a thumb up is more like "I agree we should continue working on this task, but not necessary with the current diff".

In my opinion the thumb up is a good replacement for the +1. I like the idea that it's not removed when a new diff is uploaded. If my general impression was "I like this, please merge" why should this change in the next diff?

But the thumb down is no replacement for the -1. It's more like a -2. I'm afraid people will confuse this and downvote patches when they should "Request Changes" (as @Rush pointed out). This can lead to frustration and will probably make us ignore the thumbs completely. I'm sure not everybody will constantly revisit everything and remove downvotes if the issue is fixed in a later diff. Even worse, there is no way to give a reason for a downvote.

mattflaschen wrote on 2014-05-01 00:12:23 (UTC)

thiemowmde:

In Gerrit a +/-1 is for a patch set (of possibly many patch sets in a patch). Here it is for a task (with possibly many diffs).

Tokens (e.g. Thumbs Up) can be awarded for both tasks and differentials (equivalent to a Gerrit change). I added one to D1 as a test.

qgil wrote on 2014-05-01 00:44:52 (UTC)

In Gerrit a +/-1 is reseted if a new patch set is uploaded. Here it's not (to be precise, I assume it's not).

We have the option to reset or not, see T226: LCMD - set up on wmflabs

mattflaschen wrote on 2014-05-01 03:20:59 (UTC)

I think "Accepted" is more analogous to +2.

thiemowmde wrote on 2014-05-01 09:50:53 (UTC)

Tokens (e.g. Thumbs Up) can be awarded for both tasks and differentials

@mattflaschen I'm afraid we are all confused a lot by the different terms. Scroll down to the "Revision Update History". There is a list of "Diffs". This looks to me like the "Patch Sets" in Gerrit. In Gerrit a +/-1 is for a "Patch Set" and gets reseted when a new "Patch Set" is uploaded. In Phabricator I can't find a way to vote on a "Diff".

qchris wrote on 2014-05-01 12:28:23 (UTC)

[ “Accept Change” vs gerrit votes ]

What “Accept Change” means and how it relates to Gerrit's votes
depends a bit on how we integrate CI with Phabricator and who gets
permission to merge commits (maybe it's only Jenkins that can merge).

Depending on the Jenkins integration, “Accept Change” could become
equivalent to

  • CodeReview+1,
  • CodeReview+2, or
  • CodeReview+2 and Submit, if Jenkins allows it.

But I agree with others that “Award Token” is confusing and no
replacement for Gerrit's votes.

Especially, since you can only use one kind of token. So it'll get
hard to express “CodeReview-1, Verified+1”.

And you have to hover over tokens to see who awarded them (See
D21. Who awarded the brown coin, and who the grey one?)

And awarding/rescinding the tokens is not visible in the time line
(On D21, did qchris2 award a brown coin once? And if so what was his
comment back then?)

And the tokens do not get removed when uploading a new Differential
Revision to an existing Differential.

qchris wrote on 2014-05-01 12:43:15 (UTC)

I didn't find the possibility to restrict the permissions for accepting (merging, +2) to certain users, and therefore this paragraph above seems to describe the only way of defining maintainers. Can someone confirm?

In the "edit" part of repository management^W^WDiffusion, there is a “Pushable By” section. One can restrict the “All Users” there to some group of users.

But that does not match our current use case, as it's per project and not per branch.

When I had a look a few days ago, I asked in Phabricator IRC channel, and have been pointed to

https://secure.phabricator.com/Q49

for per branch permissions.

So it's again Herald. And it looks a bit cumbersome to me. I wanted to try it out back then, but fab.wmflabs.org told me that I lack permission to do so.

qgil wrote on 2014-05-01 15:19:33 (UTC)

Herald can also add "Blocking Reviewers", which are sort of like a "-2 by default". These reviewers must review a change before it will become "Accepted". This is often used to make sure a codebase owner, security team, etc., are involved in certain changes which match configurable criteria.

I didn't find the possibility to restrict the permissions for accepting (merging, +2) to certain users, and therefore this paragraph above seems to describe the only way of defining maintainers. Can someone confirm?

This looks like a fragile approach for a project with hundreds of repos where each repo has its own maintainers. Do we need to rely in each maintainer creating their own Herald rule? What if I nominate myself blocker reviewer of a repository where I'm nobody? Do they really need to wait for me to accept patches?

How to recreate +/-1 +/-2 is almost a trivial problem compared with this hyperhorizontal setup, if it is true that anybody can accept a patch unless a blocker reviewer (who could also be anybody) stops them.

qgil wrote on 2014-05-02 05:44:18 (UTC)

I have edited the description with the intention of adding some structure to this complex discussion, seeing where are the pain points, and decide what to do about them. I'm not a code review process expert! Your edits to the description are welcome.

cscott wrote on 2014-05-02 15:48:32 (UTC)

Someone asked for more details about how -2 are used in gerrit. Two main use cases: first, "I found a bug which would break the world if this patch were accepted as is, absolutely do not merge this." It is a stronger -1, mostly to prevent accidental merging. Relatedly, -2 is used to say "I am the author and this patch is not ready to go yet (or I'm going to significantly revise it), feel free to read it but I'm going to proactively prevent anyone from accidentally merging it." Both of these prevent accidents.

The other main use case is to put a proactive hold on a patch, "this needs higher-level review" or "don't commit this until I've had a chance to review it".

-2s are sticky, which is the primary benefit compared to -1s.

scfc wrote on 2014-05-06 20:57:00 (UTC)

@thiemowmde:

In my opinion the thumb up is a good replacement for the +1. I like the idea that it's not removed when a new diff is uploaded. If my general impression was "I like this, please merge" why should this change in the next diff?

Because there is no requirement that a patchset will improve just by uploading a different one :-).

Qgil renamed this task from No more voting +1, or -1 on Code Review to Define an equivalent to Gerrit's +-1 +-2 for code review evaluation.Oct 9 2014, 9:55 PM
Qgil updated the task description. (Show Details)
Qgil removed a project: Wikimedia Phabricator RfC.
Qgil set Security to None.
Qgil subscribed.

Task refactored; please review the title and description, and have your say.

One thing seems to be confusing people a lot - that is, differential doesn't merge your changeset for you. Once the diff is accepted, someone still has to push the change to the repo. So Differential isn't going to grant the world access to a repository allowing everyone to bypass maintainers. It's a code review tool not a gatekeeper.

Qgil lowered the priority of this task from Medium to Low.Nov 27 2014, 7:57 AM

Currently #Code-Review is in the queue of Phabricator subprojects after Project-Management and Gitblit-Deprecate. If someone wants to work on this task, take it and assign the priority accordingly.

@mmodell: I started reading all the task in Gerrit-Migration, so i'm just beginnign to understand Differential :) If i understand your statement correct, that's the workflow (as an example):

  1. User A writes a new change in mediawiki/core
  2. User A push the change for review (i'm still not sure, how, like gerrit with git push HEAD:refs/for/master?)
  3. User A adds reviewers to the diff (or they get added by herald or someone else) [1]
  4. User B reviews my change
    1. Gives a -1 (how?), it has to be reverted when User A uploads a new change
    2. Gives a +1 (how?) it has to be reverted when User A uploads a new change (think about: What if patch set 2 contains a new/other problem, which wasn't present in patch set 1? This shouldn't have to "auto +1")
    3. Gives a -2 (how?) the change can not be merged
    4. Gives a +2 (how?)
  5. Given for a +2, the change would be trigger some tests in jenkins now (in case the project has some tests in jenkins; wmf hosted extensions has jenkins jobs) and jenkins will send a Verified +1 (i don't know a field for that? How could we indicate, that jenkins tests fails like in gerrit (easily without scrolling down all the comments); a field needs to be reverted when a new patch set uploaded) and submit request to Differential

Now what? Gerrit would merge the change into the master branch of the repo and that's it. In Differential i need to download the diff to my local environment, merge it with the master branch and push it to the git repo? That wouldn't be very good, sometimes i review changes without testing them again [2].

It's possible that i totaly misunderstand anything :)

[1] Side note: As readed in {D1}, it's not possible to ad _myself_ (the author of a diff) as a reviewer, so i have not -2 possibility (which would prevent jenkins, or any other user, from merging/submitting without someone removes my -2)?

[2] Example: I have a change that you've tested and all is working fine, but you -1 it because you want, that there is a comment somewhere; User A uploads a new patchset and in the diff between ps1 and ps2 you see only the comment added, then i want to be able to merge it on any computer without the need of my dev environment, in gerrit I simply add my +2 and let jenkins and gerrit do their jobs, how i can do it in Differential?

@mmodell: I started reading all the task in Gerrit-Migration, so i'm just beginnign to understand Differential :) If i understand your statement correct, that's the workflow (as an example):

  1. User A writes a new change in mediawiki/core
  2. User A push the change for review (i'm still not sure, how, like gerrit with git push HEAD:refs/for/master?)
  3. User A adds reviewers to the diff (or they get added by herald or someone else) [1]
  4. User B reviews my change
    1. Gives a -1 (how?), it has to be reverted when User A uploads a new change
    2. Gives a +1 (how?) it has to be reverted when User A uploads a new change (think about: What if patch set 2 contains a new/other problem, which wasn't present in patch set 1? This shouldn't have to "auto +1")
    3. Gives a -2 (how?) the change can not be merged
    4. Gives a +2 (how?)
  5. Given for a +2, the change would be trigger some tests in jenkins now (in case the project has some tests in jenkins; wmf hosted extensions has jenkins jobs) and jenkins will send a Verified +1 (i don't know a field for that? How could we indicate, that jenkins tests fails like in gerrit (easily without scrolling down all the comments); a field needs to be reverted when a new patch set uploaded) and submit request to Differential

Note: This stuff isn't entirely set in stone. Phabricator is pretty flexible and configurable. It doesn't try to prescribe the "one true way" for code review. Gerrit is a lot more opinionated. So some of this stuff can be adjusted.

  1. you use arc diff, for example:

To submit a single commit:

arc diff HEAD^

To submit everything on your branch since it branched from master:

$ git checkout my-feature-branch
$ arc diff master
  1. Yes, herald can add reviewers. Additionally, you can add reviewers via differential web interface or by adding them in your commit message with a line like Reviewers: username
  1. Phabricator doesn't have +1/+2 or -1/-1. There is only accept, request changes, or comment.
    1. Request changes. This is the same as C
    2. There is no +1, you can award a thumbs up token or just write a comment. In gerrit, +1 doesn't really do anything, it's just a comment. So phabricator doesn't explicitly offer the fairly meaningless option.
    3. Request changes.
    4. Accept the diff. Once a voting reviewer accepts the review it can be merged

Now what? Gerrit would merge the change into the master branch of the repo and that's it. In Differential i need to download the diff to my local environment, merge it with the master branch and push it to the git repo? That wouldn't be very good, sometimes i review changes without testing them again [2].

We could implement merging. It's not gerrit that merges changes, it's actually jenkins/zuul that does. Personally I really hate that it does and would highly prefer that someone explicitly merged them. The auto merging prevents me from accepting a change with the intention of merging it later or leaving it to someone else to do the merging. One option might be to comment with +merge or something to trigger the auto-merge, that would suit me anyway, as well as addressing your use-case: you could merge explicitly, without accessing your development environment.

Also, evan explained the +1/+2/-1/-2 stuff probably better than I have:

In T138#1978, @flimport wrote:

epriestley wrote on 2014-04-29 18:40:11 (UTC)

I'm not a Gerrit expert, so some of the details here might be slightly off, but:

  • "Accept" is like "+2".
  • "Request Changes" is like "-2".
  • Herald can also add "Blocking Reviewers", which are sort of like a "-2 by default". These reviewers must review a change before it will become "Accepted". This is often used to make sure a codebase owner, security team, etc., are involved in certain changes which match configurable criteria.
  • Reviewers can be removed from any state, even "Blocking". This is recorded on the transaction record to prevent abuse, and provides an escape hatch for stale rejects, someone going on vacation, etc.

There is no formal equivalent of "+1" or "-1" (i.e., nonbinding accept/reject). Existing installs generally use tokens and comments to express soft preferences or express appreciation or support for a change without performing review.

Phabricator is pretty flexible and configurable. It doesn't try to prescribe the "one true way" for code review. Gerrit is a lot more opinionated.

You keep saying it, but I personally feel Differential breaks the workflow I'm used to in several ways.

There is no formal equivalent of "+1" or "-1" (i.e., nonbinding accept/reject). Existing installs generally use tokens and comments to express soft preferences or express appreciation or support for a change without performing review.

Tokens are much more ambiguous than +1/-1, different users might follow very different criteria for them.

@Ricordisamoa:

You keep saying it, but I personally feel Differential breaks the workflow I'm used to in several ways.

But how exactly does it break your workflow? Details would be helpful.

There are two ways of dealing with that.

  1. Give it a chance and adjust to a slightly different workflow than what you are used to.
  2. Adjust arc/diff to more closely match your ideal workflow.

Tokens are much more ambiguous than +1/-1, different users might follow very different criteria for them.

I disagree, I think +1 is at least as ambiguous as thumbs-up

I'm not really with you. +1 is exactly, what the tooltip says, i'm using it a lot (i think it's a lot), still if i have +2 rights in this project, just to show: "Jey, i'm fine with that, what does other people say?" Without the need of writing a comment.

And as far as i can see (in {D1}), tokens aren't reseted after a new patchset is uploaded like in gerrit, and iirc, you can't remove your token?! So what, if i review ps1 and it looks good and I give athumbs up, now the user uploads a ps2 which is completly different and i wouldn't thumb up this new version, what should i do? Add a new token?

Ok, maybe it's an argument, that automerging stuff isn't the best idea for _some_ reasons, so i ask: where can i submit/merge a change in Differential? It' possible in gerrit (after verified and code-review +2), and jenkins/zuul should use the same submit command (via ssh as far as i know).

So, Differential in it's actual state would (in my mind) break, or at least disturb, our workflow a lot.

Ok, i can remove a token, sorry for the confusion. But it's still not reseted automativally, so the problem could be, that people doesn't look at the date, when i added a thumbsup and missunderstood it as a +1 to code i wouldn't submit :/

This is an impossible thing to generalize AFAIK as teams use +1/+2 differently.

It seems to me that you are just looking for things to worry about. I've said it a bunch of times, differential doesn't prevent us from implementing auto-merge with jenkins. I don't personally like auto-merge but that's got nothing to do with phabricator. The +1 vs tokens is a very minor nit to pick. You can always just make a comment with "+1" or "lgtm" and nothing more. That has exactly the same effect as posting code-review +1 in gerrit.

At the moment, at https://phabricator.wikimedia.org/differential/query/advanced/ there doesn't seem to be a way to search for differentials with no, positive, negative, etc. tokens.

You can always just make a comment with "+1" or "lgtm" and nothing more. That has exactly the same effect as posting code-review +1 in gerrit.

And i think in this point we differ in some way (maybe it's just personally), but comments doesn't help really to get a quick overview over a change, so i have to go through all the comments and i have to look, if the "+1" was given for an old patchset or the new one. That doesn't really help.

The +1 vs tokens is a very minor nit to pick.

I think, it's not, but that's my opinion.

It seems to me that you are just looking for things to worry about.

Definitively not. I'm just trying to say, what I think we need before we can move from gerrit to phabricator to not produce make more problems as we (maybe) have with gerrit.

Looks like upstream is working on a feature to merge in the web UI: https://secure.phabricator.com/D13022

greg raised the priority of this task from Low to Medium.Nov 4 2015, 7:27 PM
greg moved this task from To Triage to Backlog on the Differential board.

"Request Changes" is like "-2".

Well it's more like a -1 as it's the usual way to send back the patch to the author so it sees it on the UI as needing to be fixed.

A comment without "Request changes" would let the patch in the "needs review" state, so author won't see it in its Action required view, but in the "Blocked by others" view.

Maybe a problem in case of "request changes": Everyone who can edit the revision can remove the reviewer (who request changes).

Technical restrictions only serve to combat vandalism. If everyone is acting in good faith, we shouldn't have to worry about someone removing a reviewer who requested changes.

Also, phabricator has blocking reviewers, if we choose to use them.

Current thinking, and what had seeming consensus at the Developer Summit this year: Recreating Gerrit's system in Differential "just because" is not a goal. The two systems work differently for reasons that are explained by the documentation for the systems.

Closing per greg's comment and developer summit consensus.

In T138#2086156, @greg wrote:

Recreating Gerrit's system in Differential "just because" is not a goal.

How about switching to Differential's system "just because"?

How about switching to Differential's system "just because"?

That does not sound like a good idea/"reason", hence I do not see anybody doing that in Wikimedia...