[toolforge] Investigate authentication
Open, High, Public

Description

We will want to be able to open the API for users external to the toolforge infrastructure.

This task is to investigate current practices and options for us to do authentication in toolforge.

Currently we do SSL authentication using the client certificates that were generated for the tools (stored in the NFS shared folders).

Some options to investigate are:

idp.wmcloud.org

This uses CAS as the sso server.

Supports:

  • oauth
  • cas
  • openid
  • saml
  • rest

The data that we get back from the server include the ldap groups in the memberOf key:

memberOf 	[cn=tools.sqlchecker,ou=servicegroups,dc=wikimedia,dc=org, cn=tools.wm-lol,ou=servicegroups,dc=wikimedia,dc=org, cn=tools.jobs,ou=servicegroups,dc=wikimedia,dc=org, cn=project-account-creation-assistance,ou=groups,dc=wikimedia,dc=org, cn=project-cloudvirt-canary,ou=groups,dc=wikimedia,dc=org, cn=project-dumps,ou=groups,dc=wikimedia,dc=org, cn=project-wmflabsdotorg,ou=groups,dc=wikimedia,dc=org, cn=tools.toolschecker,ou=servicegroups,dc=wikimedia,dc=org, cn=tools.cloud-ceph-performance-tests,ou=servicegroups, ...
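As a rough illustration of how the memberOf data could be turned into a list of tools, here is a minimal sketch. It assumes that tool membership is represented by groups named `cn=tools.<name>` under `ou=servicegroups`, as in the sample above. The function name is hypothetical, and the naive comma split does not handle escaped commas inside a DN.

```python
TOOL_PREFIX = "tools."
SERVICE_GROUPS_OU = "ou=servicegroups"


def tools_from_member_of(member_of):
    """Return the set of tool names found in a list of LDAP group DNs."""
    tools = set()
    for dn in member_of:
        # Naive split: good enough for DNs without escaped commas.
        parts = [part.strip() for part in dn.split(",")]
        if len(parts) < 2 or parts[1].lower() != SERVICE_GROUPS_OU:
            continue  # not a service group, e.g. a project-* group
        key, _, value = parts[0].partition("=")
        if key.lower() == "cn" and value.startswith(TOOL_PREFIX):
            tools.add(value[len(TOOL_PREFIX):])
    return tools


example = [
    "cn=tools.sqlchecker,ou=servicegroups,dc=wikimedia,dc=org",
    "cn=tools.jobs,ou=servicegroups,dc=wikimedia,dc=org",
    "cn=project-dumps,ou=groups,dc=wikimedia,dc=org",
]
print(sorted(tools_from_member_of(example)))
```

The project groups (`ou=groups`) are skipped on purpose, so only tool names end up in the result.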
other projects using it

wikitech

This has no advantage over using idp/keystone

keystone

Has its own authentication protocol (see the docs: https://docs.openstack.org/keystone/pike/user/index.html)

It can be federated (use idp underneath) - https://docs.openstack.org/keystone/latest/admin/federation/introduction.html

Event Timeline

I'm trying to understand pros & cons of the different protocols, especially OIDC (OpenID Connect) vs CAS.

I played a bit with OIDC in a previous job, and I remember it as fairly complex but also quite well supported by a number of libraries that can be used to implement a client. I'm not sure how that compares with CAS.

A quick Google search led me to this presentation (although not very recent) with some useful comparisons: https://ldapcon.org/2017/wp-content/uploads/2017/08/16_Cl%C3%A9ment-Oudot_PRE_LDAPCon2017_SSO-1.pdf

Nice!

Just checked the openid python libraries mentioned in the openid page and all have been archived :/

We might be a bit restricted here on what we can actually use underneath though: we have our users in LDAP, no matter what, and we have one single-sign-on implementation in prod that we can use (idp.w.o).

My current ideas (still exploring) are:

  • directly LDAP: we want to avoid this, it's what keystone, toolsadmin and wikitech currently do, ldap is kinda flaky and load sensitive
  • keystone (as LDAP proxy): the main downside I currently see, without having played with it much, is that you need many things to authenticate (around 9 settings between domain, user, region, ...), there are not many libraries (that I have found), and there is no sso. Big advantage: if we have storage/trove integration, the auth is probably going to be very similar, so users would be used to it xd
  • idp.w.o (as LDAP proxy): we get sso, the protocol is CAS, kinda easy, probably don't need any extra library (only user-pass, no two-factor auth yet though)
  • keycloak with ldap as federated backend: we have an extra layer, but no sso, we can add local users, have different deployments, etc., requires having the extra service
  • keycloak, with idp.w.o as federated backend: this allows us to add an extra layer on top of idp, we still have sso, but we can for example add local users and such (easier local deployment for example), but requires having an extra service

We also might want to change keystone to authenticate using idp.w.o at some point, to allow sso. If that's the case, we will have to move to idp auth eventually, so we might as well do so already.

I have not tried to set up a local keystone (will try with https://quay.io/repository/openstack.kolla/keystone?tab=tags&tag=latest from today's check-in) or idp instance. If it's easy, we might not gain anything by adding keycloak (as we would be able to run keystone in local deployments, for example).

In any case, it seems quite possible that the storage access (s3 buckets) will have to be done through keystone with app credentials, so the users will have to authenticate that way to access non-public buckets. I was thinking of exposing that as part of the 'storage-api' service thingie, like:

> toolforge storage s3 list

+---------+--------------+-----------------------+
| name    | credential   | url                   |
+---------+--------------+-----------------------+
| bucket1 | somelonghash | https://url-to-bucket |
...

Depending on how it's implemented, the credentials might not be per-bucket though, but for the whole tool there (app credential associated to the tool user, if I understand correctly... @Andrew please correct me if I'm wrong xd).
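For reference, a minimal sketch of what authenticating with an app credential could look like at the protocol level, assuming the Keystone v3 identity API (a POST to `<auth_url>/v3/auth/tokens`, with the token coming back in the `X-Subject-Token` response header). The function name and the credential values are placeholders, not anything we have deployed; the sketch only builds the request body and does not talk to any server.

```python
import json


def app_credential_auth_body(credential_id, secret):
    """Build the JSON body for a Keystone v3 application credential login."""
    return {
        "auth": {
            "identity": {
                "methods": ["application_credential"],
                "application_credential": {
                    "id": credential_id,
                    "secret": secret,
                },
            }
        }
    }


# Placeholder values, for illustration only.
body = app_credential_auth_body("hypothetical-id", "hypothetical-secret")
print(json.dumps(body, indent=2))
```

Compared to user-password auth, this needs only the credential id and secret (plus the auth URL), since the app credential is already bound to a user and project.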

In any case, it seems that unless we use some frontend (e.g. keycloak) and add extra info there, what we end up having is the LDAP data. That is, when you log in (as your user), we get the list of groups you belong to (the tools).

So there's no per-tool authentication token, making T363808: [builds-api] Prefix all endpoints with `/tool/<toolname>` needed (or similar, like passing always a tool parameter, though I'd prefer the path for namespacing) right?

I think using idp.w.o is my favourite solution, as it can potentially be a true "single" sign-on that all applications can rely on. Using CAS without any extra library is probably a good first step, and we can evaluate migrating to OIDC iff we find it provides any advantage.
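To get a feel for what "CAS without any extra library" could mean, here is a minimal sketch using only the Python standard library. It assumes the CAS 3.0 endpoints (`/login` and `/p3/serviceValidate`) and that the server releases the LDAP groups as repeated `cas:memberOf` attributes; both would need checking against how idp.w.o is actually configured. The function names and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

CAS_NS = {"cas": "http://www.yale.edu/tp/cas"}


def login_url(cas_base, service):
    """URL to redirect the user to; CAS sends them back with ?ticket=..."""
    return f"{cas_base.rstrip('/')}/login?{urlencode({'service': service})}"


def validate_url(cas_base, service, ticket):
    """URL the API calls server-side to validate the ticket it received."""
    query = urlencode({"service": service, "ticket": ticket})
    return f"{cas_base.rstrip('/')}/p3/serviceValidate?{query}"


def parse_validation(xml_text):
    """Return (user, groups) on success, or None if validation failed."""
    root = ET.fromstring(xml_text)
    success = root.find("cas:authenticationSuccess", CAS_NS)
    if success is None:
        return None
    user = success.findtext("cas:user", namespaces=CAS_NS)
    groups = [
        element.text
        for element in success.findall("cas:attributes/cas:memberOf", CAS_NS)
    ]
    return user, groups


# Hand-written sample response, not captured from a real server.
SAMPLE = """<cas:serviceResponse xmlns:cas="http://www.yale.edu/tp/cas">
  <cas:authenticationSuccess>
    <cas:user>exampleuser</cas:user>
    <cas:attributes>
      <cas:memberOf>cn=tools.jobs,ou=servicegroups,dc=wikimedia,dc=org</cas:memberOf>
    </cas:attributes>
  </cas:authenticationSuccess>
</cas:serviceResponse>"""

print(parse_validation(SAMPLE))
```

The actual HTTP call to the validate URL is left out on purpose; any HTTP client would do.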

Just checked the openid python libraries mentioned in the openid page and all have been archived :/

I found a discussion about that here: https://www.reddit.com/r/Python/comments/16pin4l/a_maintained_library_for_oidc_in_python/

Looks like https://github.com/IdentityPython/idpy-oidc is maintained and certified, it includes both a server implementation (which we don't need) and a client.

So there's no per-tool authentication token, making T363808: [builds-api] Prefix all endpoints with /tool/<toolname> needed (or similar, like passing always a tool parameter, though I'd prefer the path for namespacing) right?

I agree, after thinking about it I'm in favour of adding the /tool prefix. We can discuss the implementation in T363808.
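A minimal sketch of how the prefix could drive authorization, assuming the API already has the set of tool names the logged-in user belongs to (taken from the LDAP groups). The function names are hypothetical and this is not tied to any particular web framework.

```python
def tool_from_path(path):
    """Extract <toolname> from a path like /tool/<toolname>/..., else None."""
    parts = path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "tool" and parts[1]:
        return parts[1]
    return None


def is_allowed(path, user_tools):
    """Allow the request only if the user is a member of the tool in the path."""
    tool = tool_from_path(path)
    return tool is not None and tool in user_tools


user_tools = {"jobs", "sqlchecker"}
print(is_allowed("/tool/jobs/builds", user_tools))
print(is_allowed("/tool/other/builds", user_tools))
```

Paths without the `/tool/<toolname>` prefix are rejected here, which is a design choice of the sketch: it keeps every endpoint namespaced to exactly one tool.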