Page MenuHomePhabricator

Decision request - gitlab cloud repos merge strategy
Closed, ResolvedPublic

Assigned To
Authored By
aborrero
Oct 27 2022, 12:35 PM
Referenced Files
F35705672: image.png
Nov 4 2022, 9:29 AM
F35705667: image.png
Nov 4 2022, 9:29 AM
F35639857: image.png
Oct 27 2022, 12:35 PM
F35639841: image.png
Oct 27 2022, 12:35 PM
F35639826: image.png
Oct 27 2022, 12:35 PM
F35639794: image.png
Oct 27 2022, 12:35 PM

Description

Problem

Scope: this only applies to our repos in gitlab, mostly everything under https://gitlab.wikimedia.org/repos/cloud

There are 3 available merge strategies in gitlab.w.o:

image.png (377×997 px, 57 KB)

By default, on repository creation, the first option is enabled.

The merge-commit strategy was made famous mostly by github, but it comes with a cost: it pollutes the git history with meaningless merge commits.

Constraints and risks

None. I don't think there is any particular risk or constraints associated with this decision request.

Decision record

https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T321798_gitlab_cloud_repos_merge_strategy

Options

Each option is pretty well explained by gitlab itself, so I'm using screenshots.

Option 1

Keep merge-commit strategy.

image.png (603×1 px, 69 KB)

See also https://gitlab.wikimedia.org/help/user/project/merge_requests/methods/index.md#merge-commit

Pros:
  • Default
  • Preserves the development history
  • Preserves deployability (every commit in the main history line is one single change)
  • History line is not mixed with development commits (like 'addressing comment X')
Cons:
  • History is not linear

Option 2

Use merge-commit with semi-linear history.

image.png (572×998 px, 90 KB)

See also https://gitlab.wikimedia.org/help/user/project/merge_requests/methods/index.md#merge-commit-with-semi-linear-history

Pros:
  • Preserves deployability (every commit in the main history line is one single change)
  • History is linear
  • History line is not mixed with development commits (like 'addressing comment X')
  • Preserves the development history
Cons:
  • Not default

Option 3

Use fast-forward-only merges.

image.png (731×1 px, 163 KB)

See also https://gitlab.wikimedia.org/help/user/project/merge_requests/methods/index.md#fast-forward-merge

Pros:
  • History is linear
  • Preserves the development history
Cons:
  • Not default
  • History line is mixed with development commits (like 'addressing comment X')
  • Does not preserve deployability (most commits are only part of a change, hard to tell when the change starts/stops)

Option 4

Use squash + merge fast-forward

Setup the same as option 3 + squashing all the commits into one before
(https://docs.gitlab.com/ee/user/project/merge_requests/squash_and_merge.html)

Pros:
  • History is linear
  • Preserves deployability (each commit is a change)
  • History line is not mixed with development commits (like 'addressing comment X')
Cons:
  • Not default
  • Does not preserve the development history

Event Timeline

I prefer fast-forward, partly because it's what gerrit already does, and partly because IMO it trains users to have good habits: Think one patchset at a time, inspect merge results yourself before submitting, etc.

On the other hand, one of the advantages of gitlab is that it's 'familiar' to existing github users. Will changing to fast-forward-only be baffling to those users?

+1 to fast-forward-only merges being the preferred configuration. I think the implication of this for people submitting merge requests is only that their work may have to be manually rebased on the target branch before being accepted.

I prefer fast-forward, partly because it's what gerrit already does, and partly because IMO it trains users to have good habits: Think one patchset at a time, inspect merge results yourself before submitting, etc.

+1

On the other hand, one of the advantages of gitlab is that it's 'familiar' to existing github users. Will changing to fast-forward-only be baffling to those users?

Gitlabs provides clear message if a patch needs rebasing. And the option for doing so automagically if possible. The repos covered in this decision are for infrastructure. I bet the merge strategy is just an anecdote for people sending patches to those repos.

I think that the statements:

pollutes the git history for no added value.

and

clean history, no useless merge commits.

Are subjective, so I request either removing them or giving some clearer explanation.

I do think that there's added value in the git history, specially for option 2, where you still have the linear history (like option 3), but you keep each commit in the main line "feature complete", so you should be able to go back to any of them without the application breaking.

This is specially relevant for the workflows that gitlab/github provide (branching with many commit per change) vs what gerrit does (one commit per change).

The extra rebase needed is exactly the same than fast-forwarding (and both are automated if there's no conflicts), but it preserves the "deployability" mentioned above.

I prefer fast-forward, partly because it's what gerrit already does, and partly because IMO it trains users to have good habits: Think one patchset at a time, inspect merge results yourself before submitting, etc.

The exact equivalent to what gerrit does would be "squash + fast forward" that seems not to be an option listed above (see https://docs.gitlab.com/ee/user/project/merge_requests/squash_and_merge.html), that creates a single commit out of the whole branch (1 change per commit) and puts that on the main branch.

I vote for Option 2, and, I'd change the description above to something like:

Option 1

Pros:
  • Default
  • Preserves the development history
  • Preserves deployability (every commit in the main history line is one single change)
  • History line is not mixed with development commits (like 'addressing comment X')
Cons:
  • History is not linear

Option 2

Pros:
  • Preserves deployability (every commit in the main history line is one single change)
  • History is linear
  • History line is not mixed with development commits (like 'addressing comment X')
  • Preserves the development history
Cons:
  • Not default

Option 3

Pros:
  • History is linear
  • Preserves the development history
Cons:
  • Not default
  • History line is mixed with development commits (like 'addressing comment X')
  • Does not preserve deployability (most commits are only part of a change, hard to tell when the change starts/stops)

Option 4

Squash + fast-forward merge

Pros:
  • History is linear
  • Preserves deployability (each commit is a change)
  • History line is not mixed with development commits (like 'addressing comment X')
Cons:
  • Not default
  • Does not preserve the development history

I agree with @dcaro that the "benefits" are subjective, so +1 for rewarding them in a more neutral manner.

The exact equivalent to what gerrit does would be "squash + fast forward"

I think this (what @dcaro called "Option 4") is actually supported by GitLab and would be Option 3 plus the squash setting. This is similar to what GitHub calls squash and merge and is also my favorite option.

On the other hand, one of the advantages of gitlab is that it's 'familiar' to existing github users. Will changing to fast-forward-only be baffling to those users?

GitHub too allows different merge strategies, and it's possible to configure a repo to use only one or some of them. Both Github's "squash and merge" and "rebase and merge" are fast-forward only, so I don't thing this would feel weird for users coming from GitHub.

I think this (what @dcaro called "Option 4") is actually supported by GitLab and would be Option 3 plus the squash setting. This is similar to what GitHub calls squash and merge and is also my favorite option.

+1 for this option, as it emulates what Gerrit does. I think the "con" here of developmental history not being preserved is actually a pro in disguise, because of what Andrew already pointed out – it fosters good workflow discipline:

I prefer fast-forward, [...] because IMO it trains users to have good habits: Think one patchset at a time, inspect merge results yourself before submitting, etc.

I vote for Option 2, and, I'd change the description above to something like:

Please feel free to update the options in the task description.

I vote for Option 2, and, I'd change the description above to something like:

Please feel free to update the options in the task description.

Done 👍

After a bit of discussion about how gitlab workflows will differ from gerrit workflows, I'm now leaning towards Squash + fast-forward, for the same reason that I mentioned above: it encourages users to think in terms of single feature patches or branches rather than just a free-flowing branch containing whatever. Of course there could also be best practices docs someplace to explain this.

Something to note is that with the squash workflow, things like having one commit to format the code, and then the next to do functional changes would either have to be done in different branches, one dependent on the other (that gets really tricky really fast), or mixed into the same branch, that when squashing, will combine both commits (though it's after review, but will make looking back to them harder).

Something to note is that with the squash workflow, things like having one commit to format the code, and then the next to do functional changes would either have to be done in different branches, one dependent on the other (that gets really tricky really fast), or mixed into the same branch, that when squashing, will combine both commits (though it's after review, but will make looking back to them harder).

The options for merge commit and squashing are distinct in the configuration for a GitLab repo. This means that a repo can be set to fast-forward only merge and also have squashing as "encouraged" (option checked by default). This would allow human judgement to choose to un-check the squash option for a given merge request.

aborrero moved this task from Inbox to Needs discussion on the cloud-services-team (Kanban) board.

Votes so far are:

  • option 1: nobody
  • option 2: david
  • option 3: bryan, arturo
  • option 4: francesco, slavina, andrew

Both options 3 & 4 means configuring the repository for fast-forward-only merges, i.e, having this:

image.png (305×875 px, 46 KB)

So that point should be clearly decided already.

Additionally, option 4 (the winner) means enabling the squash option (in a different config section than the merge strategy):

There are 3 methods for squashing. I propose we use the "Encourage" one, which still allows to merge pull requests composed of nice independent commits, but will default to squash.

image.png (275×883 px, 35 KB)

If voters of option 4 doesn't reject this "Encourage" mode (vs the "Require" mode) in the next few days then we can consider this decision request completed and done.

I like the "Encourage" option for squashing.

On a similar topic, I found a related discussion in MediaWiki. They rightly point out that one specific Gerrit workflow (chains of patchsets) is hard to replicate in GitLab. I think that as a replacement both options mentioned by @dcaro (different branches, and multiple commits in a single branch) might work, but we'll have to experiment to find pros and cons of each.

If voters of option 4 doesn't reject this "Encourage" mode (vs the "Require" mode) in the next few days then we can consider this decision request completed and done.

The process would call for a meeting for discussion if there's no unanimous agreement

I am 100% fine with 'encourage'

Scheduled meeting for 2022-11-22.

Not sure if I can make the meeting. Though if I cannot, I vote for option 4. I feel that each merge into the main branch should be a single feature, and thus fit into a single commit. This allows for discussion of a single feature at a time, and if necessary, the removal of a single commit should the feature need rolled back. If a patch seems to be for more than one feature, each feature should have its own code review and thus own merge. So I'll vote for option 4.

I don't think I'll be able to make the meeting either. In case it was unclear from previous comments, my vote also goes to option 4.

If voters of option 4 doesn't reject this "Encourage" mode (vs the "Require" mode) in the next few days then we can consider this decision request completed and done.

We had a meeting and this has been decided.

aborrero claimed this task.