Unable to log in to Netbox
Open, Medium, Public

Description

For some reason I'm not able to log in to Netbox. Instead I get a (for some reason unstyled) "Application Not Authorized to Use CAS" error page (screenshot: image.png).

The CAS logs are not much more helpful, either:

2024-08-30 16:45:19,972 WARN [org.apereo.cas.services.RegisteredServiceAccessStrategyUtils] - <Cannot grant access to service [netbox_oidc]; it is not authorized for use by [Majavah].>
2024-08-30 16:45:19,974 ERROR [org.apereo.cas.support.oauth.web.endpoints.OAuth20AuthorizeEndpointController] - <Cannot grant service access netbox_oidc to Majavah
	RegisteredServiceAccessStrategyUtils.java:ensurePrincipalAccessIsAllowedForService:149
	RegisteredServiceAccessStrategyAuditableEnforcer.java:byServiceAndRegisteredServiceAndPrincipal:148
	RegisteredServiceAccessStrategyAuditableEnforcer.java:lambda$execute$2:190
>

The main page at https://idp.wikimedia.org/login does list cn=nda,ou=groups,dc=wikimedia,dc=org in memberOf so I should have the right permissions.

Event Timeline

SLyngshede-WMF triaged this task as Medium priority.

Netbox is limited to the groups "ops" and "wmf". It seems a little weird that CAS would error out like that, though.

I was looking at T302870 there which seems to suggest nda did indeed previously have access? Can/should we therefore add the nda group to it? Or do we understand properly what changed since the earlier task was closed?

do we understand properly what changed since the earlier task was closed?

Maybe this?

https://gerrit.wikimedia.org/r/c/operations/puppet/+/932231

Taavi re-added the group in bb989a1c77e3cd34b844dd19b5f352efd043716a. I'm not sure what's wrong either.

Actually you're right - it's ops that isn't there now, but both 'wmf' and 'nda' are in the latest production branch.

Netbox is configured to

SOCIAL_AUTH_ALLOW_GROUPS = ['ops', 'wmf']

so we might want to add nda there. I just think it's a little strange that it would trigger an error on CAS and not Netbox
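To illustrate what such an allow-list implies, here is a minimal sketch. Only the setting name and its two values come from the configuration quoted above; the `is_allowed` helper and the "any overlap grants access" rule are assumptions for illustration, not how the Netbox deployment is known to consume the setting.

```python
# Allow-list as quoted above; the membership check below is a hypothetical sketch.
SOCIAL_AUTH_ALLOW_GROUPS = ["ops", "wmf"]


def is_allowed(user_groups, allowed_groups=SOCIAL_AUTH_ALLOW_GROUPS):
    """Return True if the user is in at least one of the allowed groups."""
    return not set(user_groups).isdisjoint(allowed_groups)


print(is_allowed(["nda"]))                        # False: nda is not on the list
print(is_allowed(["nda", "ops"]))                 # True: ops alone is enough
print(is_allowed(["nda"], ["ops", "wmf", "nda"]))  # True once nda is added
```

Under this assumed rule, an account that is in both nda and ops would pass the check without any change to the list.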

Change #1070563 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] C:netbox: Allow NDA group to access Netbox.

https://gerrit.wikimedia.org/r/1070563

That should not be the problem here as my account is in the ops group.

Latest update: Today I was able to log in, but it created a new account for me with 92e685794a4e4e52b5dbdcf56e629045 as the username instead of any of the usernames associated with my developer account.

The weird account name is due to the "taavi" user already existing in the Netbox database. The Django OIDC module that Netbox uses (social-app-django) creates a new user in the Django/Netbox database and links the accounts if it can't find a matching existing account.

In the default configuration, which we use, account matching is done using the email address. Your new login has a different email address, so social-app sees this as a different account, can't allocate your preferred username (as provided via OIDC), and then generates a random username.

In our case it would most likely be safe, perhaps safer, to do the mapping based on the username/uid. It's not supported behavior, so we would need to create a new pipeline module.
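A rough sketch of what such a pipeline step could look like, assuming it follows the same calling convention and return shape as the stock email-based association step. The factory `make_associate_by_username` and the injected `find_user` lookup are hypothetical names used to keep the example self-contained; in a real deployment the lookup would be a query against the Netbox user table, and the step would have to be a plain importable function listed in the pipeline setting.

```python
from typing import Any, Callable, Optional


def make_associate_by_username(find_user: Callable[[str], Optional[Any]]):
    """Build a pipeline step that links a login to an existing local user
    by username. `find_user` is a hypothetical lookup: username -> user or None.
    """

    def associate_by_username(backend, details, user=None, *args, **kwargs):
        if user is not None:
            # An earlier pipeline step already found the account.
            return None
        username = (details or {}).get("username")
        if not username:
            # Nothing to match on; let later steps create a user.
            return None
        existing = find_user(username)
        if existing is None:
            return None
        # Same shape the email-based association step returns.
        return {"user": existing, "is_new": False}

    return associate_by_username


# Usage with an in-memory stand-in for the user table.
local_users = {"taavi": "user-156"}
step = make_associate_by_username(local_users.get)
print(step(None, {"username": "taavi"}))
print(step(None, {"username": "someone-else"}))
```

This only helps when the username delivered via OIDC matches the local username exactly, so each account would still need to be checked against the real data first.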

@ayounsi / @cmooney how do you feel about linking account based on username/uid?

No objection to that. Seems like a good idea. In the short term we can delete the old account too.

Alternatively: Manually link the correct account in the database.

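# Run from a Django shell on the Netbox host.
from social_django.models import UserSocialAuth
from users.models.users import User

# 156: the original "taavi" account; 328: the auto-created account.
u = User.objects.get(id=156)
# Social-auth (OIDC) link currently attached to the auto-created account.
q = UserSocialAuth.objects.get(user_id=328)
# Point the link at the original account instead.
q.user = u
q.save()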

Not entirely comfortable with that based on past experiences, but it "should" work.

Looking at the data for taavi, UID linking won't work, as the preferred_username and uid don't match, so we might be limited to manually fixing the account. We should probably test this on netbox-next first.

I'll look up the relevant data and see if anything is missing.

After being blocked on this, and after a few questions and pings on IRC with no response, I've decided to fix this myself. What I did in the database was this:

netbox=# select id, username from users_user where email like '%taavi.wtf';
 id  |             username             
-----+----------------------------------
 156 | taavi
 328 | 92e685794a4e4e52b5dbdcf56e629045
(2 rows)

netbox=# select id, provider, uid, user_id from social_auth_usersocialauth where user_id in (156, 328);
 id | provider |   uid   | user_id 
----+----------+---------+---------
 64 | oidc     | Majavah |     328
(1 row)

netbox=# update social_auth_usersocialauth set user_id = 156 where id = 64;
UPDATE 1

After that I'm logged in with my old account just fine.

@taavi please do not edit Netbox's DB manually. It could lead to data inconsistencies, and the Netbox application could start crashing because some business logic didn't get applied. There is a user page in Netbox that can be used for these kinds of actions; it ensures that all the Netbox business logic, and any related changes to other tables, are applied.

@SLyngshede-WMF Please review the list of users with a UID as username (they also don't have a first/last name) in https://netbox.wikimedia.org/users/users/ (just sort by first/last name) in order to fix them. I can see 8 of them right now. If needed we could just delete the user in Netbox if the CAS integration will do the right thing at next login.

@Volans Looking at the borked users, I think a safer option is to simply rename those users in the Netbox UI.

Unlike Taavi's user, these aren't linked incorrectly; the social auth library/plugin decided that their username wasn't available and generated one for them. Before doing anything I would like to understand why, because it seems to have happened to the last seven users. I've also checked other users who signed in to Netbox for the first time after we switched to the OIDC authentication, and they all look correct.

We might want to re-scope this ticket to be more specific than "Unable to log in to Netbox". Hard to tell if it can be resolved or not. Maybe "Taavi and Southparkfan unable to log into Netbox", then ask them if their issues are fixed for them? Or mention the root cause in the title?

Change #1070563 abandoned by Slyngshede:

[operations/puppet@production] C:netbox: Allow NDA group to access Netbox.

Reason:

New group added specifically for netbox-read-only-access

https://gerrit.wikimedia.org/r/1070563

As a member of the netbox-readonly-access group, I still cannot log in to NetBox. Looks like I am experiencing the same redirect loop described in T373702#10449880:

https://idp.wikimedia.org/oidc/oidcAuthorize?client_id=netbox_oidc&redirect_uri=https://netbox.wikimedia.org/oauth/complete/oidc/&state=[state]&response_type=code&nonce=[nonce]&scope=openid profile email groups openid profile email (why have the latter three values been duplicated?)
https://netbox.wikimedia.org/oauth/complete/oidc/?code=[code]&state=[state]&nonce=[nonce]
https://netbox.wikimedia.org/login/?next=/
https://netbox.wikimedia.org/oauth/login/oidc/?next=/
https://idp.wikimedia.org/oidc/oidcAuthorize?client_id=netbox_oidc&redirect_uri=https://netbox.wikimedia.org/oauth/complete/oidc/&state=[state]&response_type=code&nonce=[code]&scope=openid profile email groups openid profile email
https://netbox.wikimedia.org/oauth/complete/oidc/?code=[code]&nonce=[nonce] (I'm not sure where &state= has gone)

(and so forth)
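The repeated scope values in the authorize URL can be confirmed mechanically. This is a small sketch using only the Python standard library; the helper name `duplicated_scopes` is made up for illustration, and the URL is the first one from the list above with the placeholder `state` and `nonce` parameters left out.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def duplicated_scopes(authorize_url: str) -> list[str]:
    """Return the scope values that occur more than once, in first-seen order."""
    query = parse_qs(urlsplit(authorize_url).query)
    scopes = " ".join(query.get("scope", [])).split()
    counts = Counter(scopes)
    return [scope for scope, seen in counts.items() if seen > 1]


url = (
    "https://idp.wikimedia.org/oidc/oidcAuthorize?client_id=netbox_oidc"
    "&redirect_uri=https://netbox.wikimedia.org/oauth/complete/oidc/"
    "&response_type=code"
    "&scope=openid profile email groups openid profile email"
)
print(duplicated_scopes(url))  # ['openid', 'profile', 'email']
```

This only shows which values are repeated in the request, not why, and it does not establish whether the repetition is related to the redirect loop.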

I am not sure, but I think the redirect from /oauth/complete/oidc/ to /login/?next=/ is unintended. @MoritzMuehlenhoff did not see anything out of the ordinary at the LDAP level.

The uWSGI logs in Logstash (link for one of the related URLs) do not contain a smoking gun. Could it be an issue in the database, similar to Taavi's?