GitLab users with only provider=cas3 identities are not found when Striker attempts to create GitLab repositories
Closed, Resolved | Public | BUG REPORT

Description

The production deployment of Striker is now configured to search for GitLab users associated with "openid_connect" provider keyed by the Developer account cn property (username). Developer accounts which were attached to GitLab via the prior "cas3" provider will not be found until they receive an "openid_connect" provider association. These associations could be back-filled via the GitLab API, but they will happen automatically as folks re-authenticate to GitLab so maybe we do not need an active backfill process?


original title: Maintainer 'Ross Mallett' unable to create new GitLab repositories connected to the 'milhistbot' tool using Striker

Be that as it may,

https://toolsadmin.wikimedia.org/tools/id/milhistbot/repos/create

Still gives me that error:

[Screenshot: Screenshot 2023-12-11 at 10.28.35 am.png]

Only Ross Mallett is listed as a maintainer at

https://toolsadmin.wikimedia.org/tools/id/milhistbot/maintainers/

This issue was first reported in T311466#9392107. Further investigation occurred in T353036: Requesting GitLab account activation for Ross Mallett.

Details

Other Assignee: dancy

Related merge request:
Title: Add fix-auth-provider script
Reference: repos/releng/gitlab-settings!56
Author: dancy
Source Branch: main-I56b954768647cd3335d99ffe61ebf45808410091
Dest Branch: main

Event Timeline

bd808 changed the task status from Open to In Progress. Dec 11 2023, 5:34 PM
bd808 claimed this task.
bd808 triaged this task as Medium priority.
bd808 created this task.
bd808 edited projects, added GitLab (Integrations); removed GitLab.

This is basically the lookup algorithm that is failing when Striker tries to create a new repo connected to milhistbot:

>>> from striker.tools.models import Tool
>>> from striker import gitlab
>>>
>>> tool = Tool.objects.get(cn="tools.milhistbot")
>>> maintainers = tool.maintainers()
>>> maintainers
<QuerySet [<Maintainer: Ross Mallett>]>
>>>
>>> gitlab_client = gitlab.Client.default_client()
>>> gitlab_maintainers = gitlab_client.user_lookup(maintainers)
@cee: {"@timestamp": "2023-12-11T16:33:50.342Z", "@version": "1", "message": "Failed to lookup user 'rmallett'", "host": "cloudweb1003", "path": "/srv/app/striker/gitlab.py", "tags": [], "type": "striker", "level": "ERROR", "logger_name": "striker.gitlab", "request_id": "none", "stack_trace": "Traceback (most recent call last):\n  File \"/srv/app/striker/gitlab.py\", line 119, in user_lookup\n    )[0]\nIndexError: list index out of range\n", "lineno": 121, "process": 8, "thread_name": "MainThread"}

The "Failed to lookup user 'rmallett'" response is confusing because we can see https://gitlab.wikimedia.org/rmallett in the GitLab web UI. Apparently something unintuitive is happening inside gitlab_client.user_lookup:

def user_lookup(self, users):
    """Lookup GitHub user data for a list of LdapUser objects."""
    users = list(filter(None, users))
    r = {}
    for user in users:
        try:
            r[user.uid] = self.get(
                "users",
                {
                    "provider": settings.GITLAB_PROVIDER,
                    "extern_uid": settings.GITLAB_EXTERN_FORMAT.format(
                        user
                    ),
                }
            )[0]
        except (APIError, IndexError):
            logger.exception("Failed to lookup user '%s'", user.uid)
    return r
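
The unintuitive part is that an empty search result is not an API error: the request succeeds and returns an empty list, and it is the [0] index that raises. The following is a minimal sketch of that behaviour, not Striker code. StubClient, its canned account data, and the User tuple are hypothetical stand-ins for the real gitlab.Client and LDAP user objects, and the sketch catches only IndexError because the stub never raises an API error.

```python
import logging
from collections import namedtuple

logger = logging.getLogger("sketch")

# Hypothetical stand-in for an LDAP user record; only the fields used here.
User = namedtuple("User", ["uid", "cn"])


class StubClient:
    """Hypothetical client whose get() mimics an identity-filtered user search."""

    def __init__(self, accounts):
        # accounts: list of dicts shaped like GitLab user records
        self.accounts = accounts

    def get(self, path, params):
        # Return every account holding a matching (provider, extern_uid) identity.
        return [
            account
            for account in self.accounts
            if any(
                identity["provider"] == params["provider"]
                and identity["extern_uid"] == params["extern_uid"]
                for identity in account["identities"]
            )
        ]

    def user_lookup(self, users):
        found = {}
        for user in filter(None, users):
            try:
                found[user.uid] = self.get(
                    "users",
                    {"provider": "openid_connect", "extern_uid": user.cn},
                )[0]  # an empty result list raises IndexError here
            except IndexError:
                logger.error("Failed to lookup user '%s'", user.uid)
        return found


# An account that exists but only carries the legacy cas3 identity.
legacy = {
    "username": "rmallett",
    "identities": [{"provider": "cas3", "extern_uid": "rmallett"}],
}
client = StubClient([legacy])
result = client.user_lookup([User(uid="rmallett", cn="Ross Mallett")])
```

With only the cas3 identity present, result is an empty dict even though the account exists, which is the same symptom as the log line above.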

Those settings values expand to:

>>> from django.conf import settings
>>> settings.GITLAB_PROVIDER
'openid_connect'
>>> settings.GITLAB_EXTERN_FORMAT
'{0.cn}'
>>> settings.GITLAB_EXTERN_FORMAT.format(maintainers[0])
'Ross Mallett'

That should make the GitLab API call: https://gitlab.wikimedia.org/api/v4/users?provider=openid_connect&extern_uid=Ross%20Mallett. Calling that manually shows that the return value is [], an empty list.
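
For reference, here is a small sketch of how those two settings combine into that query string. The Maintainer tuple and the lookup_url helper are hypothetical names for illustration, and how Striker's client actually encodes the request is an assumption; only the settings values and the resulting URL come from this task.

```python
from collections import namedtuple
from urllib.parse import quote, urlencode

# Values copied from the settings shown above.
GITLAB_PROVIDER = "openid_connect"
GITLAB_EXTERN_FORMAT = "{0.cn}"

# Hypothetical stand-in for the maintainer record; only uid and cn matter here.
Maintainer = namedtuple("Maintainer", ["uid", "cn"])


def lookup_url(user, base="https://gitlab.wikimedia.org/api/v4/users"):
    """Build the user search URL for one maintainer."""
    params = {
        "provider": GITLAB_PROVIDER,
        # "{0.cn}".format(user) reads the cn attribute of the first argument
        "extern_uid": GITLAB_EXTERN_FORMAT.format(user),
    }
    # quote_via=quote encodes the space as %20 rather than +
    return base + "?" + urlencode(params, quote_via=quote)


url = lookup_url(Maintainer(uid="rmallett", cn="Ross Mallett"))
```

Note that the search key is the cn ("Ross Mallett"), not the uid ("rmallett") that appears in the GitLab profile URL.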

What does the GitLab User API response look like for the provider and extern_uid values of the https://gitlab.wikimedia.org/rmallett account then?

"identities":[{"provider":"cas3","extern_uid":"rmallett"}],

Ah ha! This account was created before T320390: migrate gitlab away from the CAS protocol, and apparently it has not logged into GitLab again since the switch to the OIDC protocol was made. This is a variation of T343485: Striker's GitLab account lookup broken as a result of ODIC migration. In that earlier bug the lookup failed because a new account did not match the old search criteria; here an old account does not match the new search criteria.

@Hawkeye7 You should be able to fix this problem specifically for your https://gitlab.wikimedia.org/rmallett account by logging into gitlab.wikimedia.org using that account. The process of logging in with a fresh session should update the GitLab database to associate the account with the expected provider and extern_uid values that will allow Striker to find the account.

Past-bd808 knew this was a potential problem, but present bd808 forgot:

The production deployment of Striker is now configured to search for GitLab users associated with "openid_connect" provider keyed by the Developer account cn property (username). Developer accounts which were attached to GitLab via the prior "cas3" provider will not be found until they receive an "openid_connect" provider association. These associations could be back-filled via the GitLab API, but they will happen automatically as folks re-authenticate to GitLab so maybe we do not need an active backfill process?

The GitLab API does not let me search for all legacy provider=cas3 records easily, but I think I need to know how many accounts currently have this provider=cas3 issue. That number would let me better reason about the impact of continuing to rely on the hope from T343485 that organic session expiration will fix them.

bd808 renamed this task from Maintainer 'Ross Mallett' unable to create new GitLab repositories connected to the 'milhistbot' tool using Striker to GitLab users with only provider=cas3 identies as not found when Striker attempts to create GitLab repostories. Dec 11 2023, 6:41 PM
bd808 renamed this task from GitLab users with only provider=cas3 identies as not found when Striker attempts to create GitLab repostories to GitLab users with only provider=cas3 identies are not found when Striker attempts to create GitLab repostories.
bd808 updated the task description.

$ export GITLAB_HOST=gitlab.wikimedia.org
$ glab api users --paginate > paged-users.json
$ jq -s 'add' paged-users.json > users.json
$ jq '.|length' users.json
1094
$ jq 'map(select(any(.identities[]; .provider == "cas3") and all(.identities[]; .provider != "openid_connect")))' users.json > cas3-users.json
$ jq '.|length' cas3-users.json
671

It looks like quite a lot of GitLab accounts were initially created with provider=cas3 and have not authenticated since we switched to provider=openid_connect. Even after excluding accounts that are not marked as state=active, 620 accounts remain, which is about 57% of the 1094 created accounts.
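
The same selection can be written in Python for anyone who finds the jq filter hard to read. This is a sketch under assumptions: cas3_only is a hypothetical helper name, the sample records are made up, and the only fields relied on are the identities, provider, and state keys that appear in this task.

```python
def cas3_only(users, active_only=False):
    """Return users that have a cas3 identity and no openid_connect identity."""
    selected = []
    for user in users:
        providers = {identity["provider"] for identity in user.get("identities", [])}
        # Same test as the jq filter: any cas3 and no openid_connect.
        if "cas3" not in providers or "openid_connect" in providers:
            continue
        if active_only and user.get("state") != "active":
            continue
        selected.append(user)
    return selected


# Made-up sample records, shaped like entries in users.json.
sample = [
    {"username": "a", "state": "active",
     "identities": [{"provider": "cas3", "extern_uid": "a"}]},
    {"username": "b", "state": "blocked",
     "identities": [{"provider": "cas3", "extern_uid": "b"}]},
    {"username": "c", "state": "active",
     "identities": [{"provider": "cas3", "extern_uid": "c"},
                    {"provider": "openid_connect", "extern_uid": "C User"}]},
    {"username": "d", "state": "active", "identities": []},
]

legacy_names = [u["username"] for u in cas3_only(sample)]
active_legacy_names = [u["username"] for u in cas3_only(sample, active_only=True)]
```

Loading users.json with json.load and passing the resulting list to cas3_only applies the same test as the jq command above; the active_only flag corresponds to the state=active restriction.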

I tried logging in to gitlab as Ross Mallett at

https://idp.wikimedia.org/login

The message I get (in green) says:

Log In Successful

You, Ross Mallett, have successfully logged into the Central Authentication Service. However, you are seeing this page because CAS does not know about your target destination and how to get you there. Examine the authentication request again and make sure a target service/application that is authorized and registered with CAS is specified.
Attribute Value(s)
cn [Ross Mallett]
mail [hawkeye7@gmail.com]
memberOf [cn=tools.milhistbot,ou=servicegroups,dc=wikimedia,dc=org, cn=project-tools,ou=groups,dc=wikimedia,dc=org]
sshPublicKey [ssh-rsa AAAAB3NzaC1yc2EAAAADAQABA blah blah
uid [rmallett]

I can go to https://gitlab.wikimedia.org/rmallett. Plenty of activity recorded there!

But at
https://gitlab.wikimedia.org/rmallett/tool-milhistbot

No activity since 2021???

Why does it look like there are two repos? Should just be one.

I still get the same error (above) when I attempt to create a repo

I tried logging in to gitlab as Ross Mallett at https://idp.wikimedia.org/login
[...snip...]
You, Ross Mallett, have successfully logged into the Central Authentication Service. However, you are seeing this page because CAS does not know about your target destination and how to get you there. Examine the authentication request again and make sure a target service/application that is authorized and registered with CAS is specified.

If you literally started at https://idp.wikimedia.org rather than https://gitlab.wikimedia.org/ this would be expected. You can think of idp.wikimedia.org as being similar to login.wikimedia.org. It is a place that you will pass through when logging in to other Developer account related services, but there isn't much to do there directly.

The "sign in" button on gitlab links to https://gitlab.wikimedia.org/users/sign_in?redirect_to_referer=yes which then redirects you to https://idp.wikimedia.org/login with a number of request parameters that tell IdP where to redirect you after you have authenticated. This is what the warning/error message on IdP is telling you; it wants to send you on, but has no idea where you are supposed to be headed.

Why does it look like there are two repos? Should just be one.

You created https://gitlab.wikimedia.org/rmallett/tool-milhistbot on 2021-11-25 manually and then separately created https://gitlab.wikimedia.org/toolforge-repos/milhistbot on 2022-09-07 using Striker. The repo created by Striker is probably the better one to keep as it will be easier for other Wikimedians to collaborate with you from.

I still get the same error (above) when I attempt to create a repo

https://gitlab.wikimedia.org/api/v4/users?provider=openid_connect&extern_uid=Ross%20Mallett now returns the expected data for your https://gitlab.wikimedia.org/rmallett account. The gitlab_client.user_lookup call inside Striker also now finds the rmallett account as would be hoped.

I was just now able to create https://toolsadmin.wikimedia.org/tools/id/milhistbot/repos/id/1912 & https://gitlab.wikimedia.org/toolforge-repos/milhistbot-bd808-test using my superpowers in Striker. The resulting GitLab repository has https://gitlab.wikimedia.org/rmallett as a direct member and owner added by StrikerBot. This is contrary to the state described in T353036#9395325 and copied to this bug's initial description.

I can't think of any caching service being used by Striker or GitLab that would have interfered with Striker seeing your updated GitLab account information immediately, but maybe I'm missing something? Could you try again now so we can have another data point to consider?

I ran the fix-auth-provider script and updated about 667 accounts.

I took a look at the 3 that are left:

Note that all 3 of these have a GitLab id that is formed by 1 appended to their Developer account uid which seems strange and possibly related to their failure to be picked up by the fix-auth-provider script.

I just tried adding the openid_connect provider identity with extern_uid echidnalives to the echidnalives1 account and it was rejected: Extern uid has already been taken

https://gitlab.wikimedia.org/api/v4/users?provider=openid_connect&extern_uid=echidnalives finds the https://gitlab.wikimedia.org/echidna account, which is connected to the uid=echidna,ou=people,dc=wikimedia,dc=org Developer account. This is OK, however, because the extern_uid of a provider=openid_connect identity is supposed to match the cn of the Developer account record. For https://gitlab.wikimedia.org/echidnalives1 that means Sam B and not echidnalives. I have now made this connection using the /admin/users/{uid}/identities web UI.
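
The rule at work here (the openid_connect extern_uid must be the Developer account cn, not its uid) can be sketched as a planning step for a backfill. This is not the fix-auth-provider script, whose contents are not shown in this task. The function name plan_backfill, the cn_by_gitlab_username mapping, and the sample records are all hypothetical, and the sketch only computes what would be added without calling GitLab.

```python
def plan_backfill(gitlab_users, cn_by_gitlab_username):
    """Split legacy accounts into identities to add and conflicts to review."""
    # extern_uid values already claimed by some openid_connect identity
    taken = {
        identity["extern_uid"]
        for user in gitlab_users
        for identity in user["identities"]
        if identity["provider"] == "openid_connect"
    }
    to_add, conflicts = [], []
    for user in gitlab_users:
        providers = {identity["provider"] for identity in user["identities"]}
        if "openid_connect" in providers:
            continue  # already discoverable by the new search
        cn = cn_by_gitlab_username.get(user["username"])
        if cn is None or cn in taken:
            # unknown Developer account, or the value belongs to another account
            conflicts.append(user["username"])
            continue
        to_add.append(
            {
                "username": user["username"],
                "provider": "openid_connect",
                "extern_uid": cn,
            }
        )
        taken.add(cn)
    return to_add, conflicts


# Fictional records: the cn of one Developer account equals the uid of another.
users = [
    {"username": "ex",
     "identities": [{"provider": "openid_connect", "extern_uid": "example"}]},
    {"username": "example1",
     "identities": [{"provider": "cas3", "extern_uid": "example"}]},
]

# Keyed by cn, the legacy account gets a usable identity.
by_cn, by_cn_conflicts = plan_backfill(users, {"example1": "Example Person"})
# Keyed by uid by mistake, the value collides with the other account.
by_uid, by_uid_conflicts = plan_backfill(users, {"example1": "example"})
```

In the second call the legacy account lands in the conflict list, which mirrors the "Extern uid has already been taken" rejection described above.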

<aside>I have a strong hunch that both the uid=echidnalives,ou=people,dc=wikimedia,dc=org and uid=echidna,ou=people,dc=wikimedia,dc=org Developer accounts were created by the same human. I see quite a bit of this when helping folks with Toolforge and Cloud VPS issues. I assume it typically arises either from confusing instructions found on wikis or from the user forgetting that they have an existing Developer account. In this case the account creation dates are only about a month apart, so my guess is confusing instructions, but I certainly could be wrong about all of this supposition.</aside>

All 3 accounts from T353176#9401422 have now been manually fixed. My check from T353176#9398402 now finds no accounts without provider=openid_connect data. \o/ Thank you @dancy for doing the hard bits of that for me!

I am going to call this task {{Done}} per "My check from T353176#9398402 now finds no accounts without provider=openid_connect data."


@Hawkeye7 If you continue to be unable to use Striker and your "Ross Mallett" Developer account to create additional git repos for the https://toolsadmin.wikimedia.org/tools/id/milhistbot tool, please open a new bug and ping me on it. At the moment I do not know what would keep the software from working for you, so if it continues to fail we will have some deeper digging to do together.

@bd808 I can confirm that it is working now. Thanks for your assistance in resolving this problem.

\o/ Excellent. @dancy did the hard work to keep this particular frustration from happening to others in the future as well. It's messy, but I think this is what progress looks like. ;)

This gitlab repo and the associated record in Striker have now been deleted.