Page MenuHomePhabricator

git commit hash remains the same from submitting to after merge
Closed, DuplicatePublic

Description

With gerrit the git commit hash remains the same from submitting for review to after merge. (The review-id is inserted before submit.) The existence of T612 seems to make it possible that arcanist and differential might not be supporting this yet.

Event Timeline

JanZerebecki raised the priority of this task from to Needs Triage.
JanZerebecki updated the task description. (Show Details)
JanZerebecki added a project: Gerrit-Migration.
JanZerebecki added a subscriber: JanZerebecki.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 3 2015, 6:18 PM

T137 seems to also be an indicator for this.

Also an indicator T136 and maybe T162.

Aklapper triaged this task as Low priority.Mar 13 2015, 9:43 PM
greg moved this task from To Triage to Tooling on the Gerrit-Migration board.Dec 2 2015, 12:32 AM
greg set Security to None.
mmodell closed this task as Declined.Mar 31 2016, 4:49 PM
mmodell added a subscriber: mmodell.

This is simply not possible with the current state of differential.

If it currently is not possible, then why did you decline the ticket?

JanZerebecki reopened this task as Open.Apr 10 2016, 9:20 PM

It's not clear from this task why this is desirable or even at all relevant.

Git works by being content addressable. There is no justification given why a review system would need to change the content arbitrarily. That in turn moves more trust from the ends to a central location. Thus it likely decreases security and makes security improvements harder. Overall that this is not supported is a big design smell, I think any good design of a review system that can work with git will have this property in its minimally usable implementation.

What part of the content does differential change? And why is that necessary?

That in turn moves more trust from the ends to a central location.

The way git is used in wikimedia space always moves trust to a central location. We have reviewers, and the reviewers are the ones that give the yes-or-no, and from that point on a commit is part of the repository. What happens beforehand, and whether commit hashes change or not (it's common for reviewers to rebase a patch already, for example) is irrelevant.

Thus it likely decreases security and makes security improvements harder.

{{citation needed}}. Our security model is not based on commit hashes.

As for what changes:
Compare the original submitted patch
https://secure.phabricator.com/D10823?id=25983

to the version in the repository:
https://github.com/phacility/phabricator/commit/ce55bb1d9630f35a9ab4dee2dced03536e484efd

In other words, it adds meta-information that Gerrit only exposes through the web interface. Pick your poison -- I'd rather have more information in the commit message and less in the code review location.

That in turn moves more trust from the ends to a central location.

The way git is used in wikimedia space always moves trust to a central location. We have reviewers, and the reviewers are the ones that give the yes-or-no, and from that point on a commit is part of the repository. What happens beforehand, and whether commit hashes change or not (it's common for reviewers to rebase a patch already, for example) is irrelevant.

If I make a local commit and that gets merged in gerrit I can know it was not changed by comparing the commit hash. If a reviewer rebases a patch with gerrit on a repo with the setting "merge if necessary", it is unnecessary and they destroy some history.

Thus it likely decreases security and makes security improvements harder.

{{citation needed}}. Our security model is not based on commit hashes.

Commit hashes that remain the same reduce the amount of information that is not yet reviewed by a participant. If you want to improve on the security of trusting one central git server, one possibility is to sign commits which doesn't work if they get changed while passing review.

As for what changes:
Compare the original submitted patch
https://secure.phabricator.com/D10823?id=25983
to the version in the repository:
https://github.com/phacility/phabricator/commit/ce55bb1d9630f35a9ab4dee2dced03536e484efd

Thank you, that helped. Though it is hard to see what else beside the message changed, as I don't have your original commit.

In other words, it adds meta-information that Gerrit only exposes through the web interface. Pick your poison -- I'd rather have more information in the commit message and less in the code review location.

No Gerrit adds that information in git notes in a specific ref (that is not cloned by default). If you choose git displays that in git log appended to the commit message. Git notes are things that attach additional information to an existing commit without changing the history of that branch. Another possibility is adding that information in a merge commit. Git has various features to make history preserving changes (in the sense of a cryptographically secured append only log, think bitcoin block chain), there is no need to pick a poison.

How does differential work when the pull request has multiple commits?

greg moved this task from To Triage to Backlog on the Differential board.May 12 2016, 10:37 PM

If I make a local commit and that gets merged in gerrit I can know it was not changed by comparing the commit hash. If a reviewer rebases a patch with gerrit on a repo with the setting "merge if necessary", it is unnecessary and they destroy some history.

Nonetheless, it's still common and not all of our repos use "merge if necessary"

In other words, it adds meta-information that Gerrit only exposes through the web interface. Pick your poison -- I'd rather have more information in the commit message and less in the code review location.

No Gerrit adds that information in git notes in a specific ref (that is not cloned by default).

It's true that Gerrit exposes reviews as git notes, however, it's also true that gerrit changes the commit hash by way of updating the commit message to add change-id. The only difference is that arcanist does it in php code and gerrit does it in a git commit hook. Arcanist supports 'immutable' mode of operation which prevents rewriting of commits. I really wouldn't recommend using it though, because that greatly reduces the choices of workflow available to contributors.

How does differential work when the pull request has multiple commits?

The default behavior is to squash everything into a single commit when landing a revision. It's extremely flexible, however, arcanist & differential can be configured to support almost any conceivable workflow.

If I make a local commit and that gets merged in gerrit I can know it was not changed by comparing the commit hash. If a reviewer rebases a patch with gerrit on a repo with the setting "merge if necessary", it is unnecessary and they destroy some history.

Nonetheless, it's still common and not all of our repos use "merge if necessary"

Those are a hassle and it takes me much more time to merge patches there. I can't just add a +2 to multiple patches in such a repo at the same time. I need to wait for each patch to get merged before I can +2 the next one. Because of that knowledgeable users bypass this setting, which makes things more confusing. I don't know of any reason to not use "merge if necessary" that can not be handled during the deployment process instead.

In other words, it adds meta-information that Gerrit only exposes through the web interface. Pick your poison -- I'd rather have more information in the commit message and less in the code review location.

No Gerrit adds that information in git notes in a specific ref (that is not cloned by default).

It's true that Gerrit exposes reviews as git notes, however, it's also true that gerrit changes the commit hash by way of updating the commit message to add change-id. The only difference is that arcanist does it in php code and gerrit does it in a git commit hook.

Those are two different things, what I was speaking about is changes to the commit after "it gets a +2".

Arcanist supports 'immutable' mode of operation which prevents rewriting of commits. I really wouldn't recommend using it though, because that greatly reduces the choices of workflow available to contributors.

https://xkcd.com/1685/

How does differential work when the pull request has multiple commits?

The default behavior is to squash everything into a single commit when landing a revision. It's extremely flexible, however, arcanist & differential can be configured to support almost any conceivable workflow.

While skimming https://secure.phabricator.com/T5000 i got the impression that is not true (think github workflow or even Linus compatible workflow) and that phabricators 'immutable' is not enough, but I have not had time to properly read the comments there.