SSH Access of Git data in GitLab
Open, MediumPublic

Description

GitLab sets up Git access by creating a local git user and either adding a line to that user's authorized_keys file or using AuthorizedKeysCommand to look up a key in the local GitLab database.
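As a rough illustration of the lookup idea (not GitLab's actual implementation), the sketch below resolves an offered public key against a small local table and returns an authorized_keys-style line with a forced command. The table layout, the helper names, and the forced-command path are all hypothetical.

```python
import sqlite3


def build_key_db(rows):
    """Create an in-memory key table; a stand-in for a real key store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE keys (key_id INTEGER PRIMARY KEY, public_key TEXT UNIQUE)")
    db.executemany("INSERT INTO keys (key_id, public_key) VALUES (?, ?)", rows)
    return db


def authorized_keys_line(db, offered_key):
    """Return one authorized_keys line for offered_key, or '' if the key is unknown."""
    row = db.execute(
        "SELECT key_id FROM keys WHERE public_key = ?", (offered_key,)
    ).fetchone()
    if row is None:
        # Printing nothing means "no matching key", so authentication fails.
        return ""
    # The command path is hypothetical; the remaining options are standard
    # authorized_keys restrictions.
    options = (
        f'command="/usr/local/bin/hypothetical-git-shell key-{row[0]}",'
        "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"
    )
    return f"{options} {offered_key}"
```

A single indexed lookup per connection is what avoids scanning a very large authorized_keys file on every login.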

GitLab does allow SSH to run on a non-standard port by setting gitlab_rails['gitlab_shell_ssh_port'] = [alt-port-number] in /etc/gitlab/gitlab.rb.

This becomes complicated on our production machines due to (at least):

  • Puppet manages SSH which would probably not appreciate this system interfering
  • There are bastions involved (which wouldn't necessarily have the keys of folks who registered for developer accounts, even though those keys are in LDAP).

This ticket is to discuss proposed solutions.

Event Timeline

thcipriani updated the task description. (Show Details)

The hope with this task is to try to surface additional concerns about the proposal so that we can open ssh cloning for the GitLab production VM. @mark and @wkandek specifically mentioned @MoritzMuehlenhoff as a person who might have Opinions™ about this, so adding @MoritzMuehlenhoff explicitly.

Given the ratio between number of people that will be using git SSH access, and people that will log in to manage the system, I would spend an extra minute here to try and stick with the default SSH port for git. Here are a few options I believe may balance out convenience and security concerns:

  1. Running public git+ssh access on the standard SSH port (22), while keeping admin SSH on a non-standard port
  2. Keeping a single SSH daemon, but restricting any account other than git from logging in unless it's coming from an internal IP or through a bastion (this can be achieved with an sshd Match block, or a PAM configuration)
  3. Passing gitlab SSH traffic through a separate proxy, which is not ideal, but still a viable solution.
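The access rule behind option 2 can be modelled in a few lines. This is only a sketch of the policy, not sshd configuration: the internal network range and the function name are assumptions made for illustration.

```python
import ipaddress

# Hypothetical internal range; the real one would come from site configuration.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
# Accounts that may log in from anywhere.
PUBLIC_USERS = {"git"}


def login_allowed(user, source_ip, internal_networks=INTERNAL_NETWORKS):
    """Allow 'git' from any address; allow everyone else only from internal networks."""
    if user in PUBLIC_USERS:
        return True
    addr = ipaddress.ip_address(source_ip)
    # A membership test across address families (v4 vs v6) is simply False.
    return any(addr in net for net in internal_networks)
```

With this model, login_allowed("root", "203.0.113.9") is rejected while login_allowed("git", "203.0.113.9") is accepted, which is the behaviour a Match rule on the source address would have to reproduce.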

Please let me know your thoughts on this.

I don't think it's complicated at all. It should run fine on a single sshd with a Match rule that allows only the user git from external networks (and, while we're at it, applies a ForceCommand to it, too).
The part that may be controversial (simple, but controversial) is opening port 22 in the firewall to this machine. However, an sshd listening on an alternate port, with that port opened instead, is equally bad should there be a fatal sshd vulnerability.

Using a separate sshd could help in that it can run in a container and listen on a separate, public interface rather than on the "main" internal IP address. In any case, this machine would be a half-bastion and should not be able to initiate connections to the internal network (although it will need to for things like Puppet or apt caches).

PS: Wouldn't it make more sense to use AuthorizedKeysCommand and fetch the keys directly from LDAP rather than using an authorized_keys file? Surely GitLab supports that...

GitLab takes a somewhat opposite approach here. The GitLab server manages its own user key database and can also sync user keys from LDAP. It does support AuthorizedKeysCommand, but in its own way: to quickly look up a key in a local database, which alleviates issues with huge authorized_keys files. I think this has even become the default for some configurations lately.

As for the options, personally I like the single SSH daemon with a Match rule restricting only git to public access. It may sound dangerous, but honestly, workarounds do add more complexity while not offering much more security.

Updated the ticket with details about AuthorizedKeysCommand, and removed the proposed solution (here I was, worried this ticket wouldn't spark any debate :)).

Given the ratio between number of people that will be using git SSH access, and people that will log in to manage the system, I would spend an extra minute here to try and stick with the default SSH port for git.
[...]
Please let me know your thoughts on this.

This is good advice. I've been gradually stockholmed to Gerrit's use of port 29418 for SSH access (which I admittedly never remember and had to look up in my bash history just now).

The solution for using SSH on port 22 becomes controversial due to our use of bastion hosts. My original proposal side-stepped this issue, but at the expense of usability. Medium-/long-term I like the proposal of a single SSH daemon using match rules.

It may be simplest in the short term to split the discussion: what's simple and uncontroversial vs. what's ideal and needs more discussion. This setup is meant as a minimum viable product; having some technical debt that can be resolved later (i.e., being on port 29418) isn't the end of the world for me and matches our current setup. The ideal outcome of using port 22 (i.e., a half-bastion, to borrow a phrase) would need more discussion.

A second sshd daemon for port 29418 concentrates all configuration for gitlab/ssh access in a separate configuration file and does not touch the "normal" config file. That seems to me a more robust way of configuring, minimizing impact on sshd on port 22. sshd on port 29418 can then simply be limited to user git.

A second sshd daemon for port 29418 concentrates all configuration for gitlab/ssh access in a separate configuration file and does not touch the "normal" config file. That seems to me a more robust way of configuring, minimizing impact on sshd on port 22. sshd on port 29418 can then simply be limited to user git.

Agreed.

I remember having to help people in #mediawiki whose network blocked very high ports like 29418. Maybe this is a silly question, but can we run the GitLab sshd on port 22 and the normal sshd for getting into the server on some other port like 2222?

Also I believe git-ssh.wikimedia.org ran/runs a public sshd on port 22 - how is that working safely?

I remember having to help people in MediaWiki-General whose network blocked very high ports like 29418. Maybe this is a silly question, but can we run the GitLab sshd on port 22 and the normal sshd for getting into the server on some other port like 2222?

I like this. Admins can easily enough use an arbitrary port, but it'd be great not to create a situation where a bunch of users have remotes they're going to have to change at some later date. Let's default to the expected behavior for most users, if we can.

Allow me to stress that the SSH port for Gitlab is a long term choice. Whatever is decided here, will have to be carried for a long time, it will stay in numerous remote origins forever. My +1 to a standard port here.

JMeybohm triaged this task as Medium priority.Mar 3 2021, 8:08 AM

Also I believe git-ssh.wikimedia.org ran/runs a public sshd on port 22 - how is that working safely?

That is Phabricator, not Gerrit and it has the full LVS/pybal setup:

https://config-master.wikimedia.org/pybal/eqiad/git-ssh

This comment was removed by wkandek.

To put as few obstacles in front of developers as possible, access through port 22 is the preferred option.

I still believe having a separate sshd is more robust and easier to manage. I suggest we add a second external IP to the server that will be gitlab.wikimedia.org and run the web interface and the sshd daemon configured for user git. The normal sshd daemon for admin access will continue to run on gitlab1001.wikimedia.org.

Should it be gitlab.wikimedia.org for both, https and ssh? (so both the webserver and second sshd would listen on that new additional IP).

Or should it be a different name, like git-ssh.wikimedia.org is for Phabricator?

Change 674439 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] drop gitlab CNAME, in favor of service name on separate IP

https://gerrit.wikimedia.org/r/674439

Change 674439 merged by Dzahn:
[operations/dns@master] drop gitlab CNAME, in favor of service name on separate IP

https://gerrit.wikimedia.org/r/674439

+gitlab                                   1H IN A 208.80.154.14                                                                       
+gitlab                                   1H IN AAAA 2620:0:861:1:208:80:154:14

Should it be gitlab.wikimedia.org for both, https and ssh? (so both the webserver and second sshd would listen on that new additional IP).

Or should it be a different name, like git-ssh.wikimedia.org is for Phabricator?

Gitlab supports having separate host names for web and SSH, and I personally like the flexibility of being able to offload SSH to a load balancer or a proxy when workload increases.

Change 674446 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] gitlab: add gitlab.wikimedia.org service IP with interface::alias

https://gerrit.wikimedia.org/r/674446

Change 674446 merged by Dzahn:
[operations/puppet@production] gitlab: add gitlab.wm.org service IP, with lookup from Hiera

https://gerrit.wikimedia.org/r/674446

gitlab1001 has 2 IPs (2x v4, 2x v6) now:

@gitlab1001:~# ip a s | grep 208
    inet 208.80.154.6/26 brd 208.80.154.63 scope global ens5
    inet 208.80.154.14/32 scope global ens5
    inet6 2620:0:861:1:208:80:154:14/128 scope global deprecated 
    inet6 2620:0:861:1:208:80:154:6/64 scope global dynamic mngtmpaddr

208.80.154.6 is the server (gitlab1001)
208.80.154.14 is the service (gitlab)
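For readers less used to address aliases: the service IP is a /32 alias that sits inside the host's /26 subnet. A minimal sketch with Python's standard ipaddress module, using only the IPv4 addresses from the output above (the helper name is made up):

```python
import ipaddress

# Addresses taken from the `ip a s` output above.
host = ipaddress.ip_interface("208.80.154.6/26")      # gitlab1001 (server)
service = ipaddress.ip_interface("208.80.154.14/32")  # gitlab (service alias)


def same_subnet(host_if, service_if):
    """Return True if the service address falls inside the host interface's network."""
    return service_if.ip in host_if.network
```

Here same_subnet(host, service) holds, and host.network.broadcast_address matches the brd value shown for ens5.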

@thcipriani @Sergey.Trofimovsky.SF @wkandek exec summary for you:

  • VM has 2 public IPs now, one intended to face the public and one for admins
  • so far both public services HTTPS and SSH would run on the same IP and host name, gitlab.wikimedia.org, there would not be a separate hostname like git(lab)-ssh.wikimedia.org
  • there is ongoing work by John to refactor our puppet sshd code and add new parameters to let it listen on multiple IPs. Since we know that gitlab does NOT come with its own sshd, this will allow us to have a second instance that gitlab can use. This needs thorough code review though and will still be in the Gerrit queue for a little while, partially due to Easter holidays
Dzahn changed the task status from Open to Stalled.Tue, Mar 30, 10:34 PM

https://gerrit.wikimedia.org/r/c/operations/puppet/+/675135 and the entire chain below limited ssh access to the primary address. This unblocked the path to having different configs for different sshds. Thanks to JohnBond!

O:gitlab: restrict gitlab ssh to only listen on the primary ip addresses
(Merged)
C:ssh::server: add support for multiple listen addresses
(Merged)
C:ssh::server: make authorized_keys_file an Array[Stdlib::Unixpath]
(Merged)
C:ssh::server: update parameter types
(Merged)
C:ssh::server: refactor
(Merged)

This ticket is to discuss proposed solutions.

@thcipriani I think this happened: we know the solution now and have made it possible. Unless this ticket is also supposed to cover the actual ssh config for gitlab, I think it's done.

If it should stay open until actual users have ssh access then I would now hand it over to Speed & Function.

Dzahn changed the task status from Stalled to Open.Mon, Apr 5, 9:39 PM

boldly assigning this to @Sergey.Trofimovsky.SF now.

There are 2 public IPs on the VM, gitlab1001.wikimedia.org (208.80.154.6 / 2620:0:861:1:208:80:154:6) which you can ssh to for shell access for admins and gitlab.wikimedia.org (208.80.154.14 / 2620:0:861:1:208:80:154:14), which you can use for https and ssh access for end-users.

And because the primary sshd is now NOT listening on all interfaces anymore (thanks to John's work) but only on the one meant for admin access you should be unblocked from running public services on the other IP.

This lets you use gitlab.wikimedia.org for the public and avoids using an "ugly" port like 29418 as requested, while also keeping sshd configs separated.

   inet 208.80.154.6/26 brd 208.80.154.63 scope global ens5
..
    inet 208.80.154.14/32 scope global ens5
..

tcp        0      0 208.80.154.6:22         0.0.0.0:*               LISTEN      0          24600196   25524/sshd
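A quick way to confirm from the outside which address answers on port 22 is a plain TCP connect test. The helper below is a hypothetical sketch using only the standard library; it checks that something accepts connections, not that it is sshd.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_open("gitlab1001.wikimedia.org", 22) and port_open("gitlab.wikimedia.org", 22) could be compared; the results depend on where the check is run from and on firewall rules.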

boldly assigning this to @Sergey.Trofimovsky.SF now.

Thanks, got it. I confirm gitlab.wikimedia.org is available for SSH access publicly.

Question: do you guys want to manage this sshd's config file with puppet, like the main one? We can provide suggested safe config to you guys to start with. Alternatively, we can manage it with Ansible together with Gitlab, but it seems a bit out of place there.

Personally I think this (the IP address) would be a good line to do the separation at. We handle the admin access SSH in puppet while you manage the services that are part of the user experience and run on the service IP. Open to comments by others.

Personally I think this (the IP address) would be a good line to do the separation at. We handle the admin access SSH in puppet while you manage the services that are part of the user experience and run on the service IP.

That feels like a pretty clean division of things to me.

Change 677497 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] gitlab: use correct IP addresses

https://gerrit.wikimedia.org/r/677497

Change 677497 merged by Jbond:

[operations/puppet@production] gitlab: use correct IP addresses

https://gerrit.wikimedia.org/r/677497

Thanks, got it. I confirm gitlab.wikimedia.org is available for SSH access publicly.

This highlighted an issue on our side: ssh was listening on gitlab1001.wikimedia.org for IPv4 but on gitlab.wikimedia.org for IPv6. I have now updated our configuration so that ssh only listens on gitlab1001.wikimedia.org for both IPv4 and IPv6. To clarify further: at this point it is expected that gitlab.wikimedia.org will not respond to SSH on tcp/22. The intention is that S & F will create a new ssh config and service and run a separate SSH daemon which only listens on gitlab.wikimedia.org:22. Please let me know if you need more clarification.
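To make the intended split concrete, here is a hedged sketch that renders a minimal config for such a second daemon. It is not the config that was deployed or reviewed: the function name, the PidFile path, and the choice of hardening options are assumptions; the directives themselves are standard sshd_config keywords.

```python
import ipaddress


def render_sshd_config(listen_addresses, port=22, allowed_user="git"):
    """Return sshd_config text for a Git-only sshd bound to the given addresses."""
    lines = [f"Port {port}"]
    for address in listen_addresses:
        # ip_address() raises ValueError on malformed input, so typos fail early.
        lines.append(f"ListenAddress {ipaddress.ip_address(address)}")
    lines += [
        f"AllowUsers {allowed_user}",
        "PasswordAuthentication no",
        "PubkeyAuthentication yes",
        "AllowTcpForwarding no",
        "X11Forwarding no",
        "PermitTTY no",
        # Hypothetical path; a separate PidFile keeps the two daemons apart.
        "PidFile /run/sshd-gitlab.pid",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_sshd_config(["208.80.154.14", "2620:0:861:1:208:80:154:14"]) yields a file that binds only the service addresses, leaving the admin sshd on gitlab1001.wikimedia.org untouched.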

Question: do you guys want to manage this sshd's config file with puppet, like the main one?

As mentioned there is no need for us to manage this new config in puppet however it would be useful to review the config once stable and before we open things up to the world.

As mentioned there is no need for us to manage this new config in puppet however it would be useful to review the config once stable and before we open things up to the world.

We're making it a part of the Ansible playbook that manages Gitlab installation. I believe you should have access to that soon.

We're making it a part of the Ansible playbook that manages Gitlab installation. I believe you should have access to that soon.

Sounds good to me, thanks.